OpenAI
GPT-5.4 mini pricing
GPT-5.4 at 1/3 the price. 400K context and the same tokenizer as full 5.4 - the right default for most production workloads.
Input
$0.75 / 1M tok
Output
$4.50 / 1M tok
- Context window: 400K
- Max output: 128K
- Cached input: $0.075 / 1M
- Verified: 2026-04-06
GPT-5.4 mini sits in the mid-tier slot OpenAI reserves for workhorse production traffic. At $0.75 per 1M input tokens and $4.50 per 1M output, it costs roughly a third of full 5.4 and roughly three times nano. The 400K context window is smaller than 5.4's 1M but still large enough for most realistic document workflows.
Cached input at $0.075 per 1M (a 90% discount) makes repeated-prefix workloads cheap. Classification, chat, routing, and light summarisation tasks that'd run fine on 4o mini can typically step up to 5.4 mini for better reasoning without moving the bill significantly - depending on traffic shape it might even land lower.
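As a sketch of how the cached-input discount moves the bill, using the rates from the table above (the request shape and cache hit rate below are illustrative assumptions, not measurements):

```python
# Estimated monthly spend on GPT-5.4 mini with optional prompt caching.
# Rates are from the pricing table above, expressed per token.
INPUT_RATE = 0.75 / 1_000_000    # $ per fresh input token
CACHED_RATE = 0.075 / 1_000_000  # $ per cached input token (90% discount)
OUTPUT_RATE = 4.50 / 1_000_000   # $ per output token

def monthly_cost(requests, prompt_toks, output_toks, cached_fraction=0.0):
    """Estimate monthly API spend; cached_fraction is the share of
    prompt tokens served from the prompt cache (0.0 to 1.0)."""
    cached = prompt_toks * cached_fraction
    fresh = prompt_toks - cached
    per_request = (fresh * INPUT_RATE
                   + cached * CACHED_RATE
                   + output_toks * OUTPUT_RATE)
    return requests * per_request

# 100k requests/month, 1,000-token prompts, 500-token replies:
print(round(monthly_cost(100_000, 1_000, 500), 2))                       # 300.0
print(round(monthly_cost(100_000, 1_000, 500, cached_fraction=0.8), 2))  # 246.0
```

With an 80% cache hit rate on the prompt prefix, the same traffic lands around $246 instead of $300; because output tokens dominate the bill at these rates, caching helps most on long-prompt, short-reply shapes.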
Calcis counts GPT-5.4 mini input tokens with o200k_base (tiktoken), the same tokenizer OpenAI bills against, so the token count on your screen matches the one on your invoice.
Estimate your cost on GPT-5.4 mini
Paste your prompt into the estimator, pick GPT-5.4 mini, and see the exact dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.
Frequently asked
- How much does GPT-5.4 mini cost per request?
- A 1,000-token prompt with a 500-token reply costs about $0.003 ($0.00075 input + $0.00225 output). At 100,000 requests a month, that's around $300 in API fees.
- Is GPT-5.4 mini better than GPT-4o mini?
- On most benchmarks, yes - newer generation, better reasoning. On price, GPT-4o mini is cheaper ($0.15 / $0.60 vs $0.75 / $4.50 per 1M). Pick 5.4 mini when you need the reasoning lift; stay on 4o mini for pure price-sensitive workloads.
- Does GPT-5.4 mini support the full 1M context?
- No - the context window is 400K tokens vs 1M on the full GPT-5.4. Max output is 128K, same as the full model.
- What's the cached input discount on GPT-5.4 mini?
- $0.075 per 1M cached tokens - a 90% discount on the standard $0.75 input rate. Applied automatically when prompt prefixes repeat within 5 minutes.
Pricing verified 2026-04-06 from the provider's rate card.