OpenAI
GPT-5.4 nano pricing
The cheapest GPT-5.4 tier. For high-volume chat and routing where latency and cost matter more than depth.
Input: $0.20 / 1M tok
Output: $1.25 / 1M tok
- Context window: 400K
- Max output: 128K
- Cached input: $0.020 / 1M
- Verified: 2026-04-06
GPT-5.4 nano is the floor of the GPT-5.4 family: $0.20 per 1M input tokens and $1.25 per 1M output tokens, with a 400K context window and 128K max output. At these rates a 1,000-token prompt with a 500-token reply costs about $0.0008 ($0.0002 input + $0.000625 output) - less than a tenth of a cent per call.
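The per-call arithmetic above can be sketched in a few lines, using the rates from the table (token counts here are illustrative, not a guarantee of how any given prompt tokenizes):

```python
# GPT-5.4 nano rates from the rate card above, in dollars per 1M tokens.
INPUT_PER_M = 0.20
OUTPUT_PER_M = 1.25

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at GPT-5.4 nano rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# 1,000-token prompt with a 500-token reply ≈ $0.000825 per call.
print(f"${call_cost(1_000, 500):.6f}")
```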
Nano sits in the slot where you used to put gpt-3.5: high-volume chat, log enrichment, classification, routing, simple summarisation. The 400K context means you can stuff most realistic workloads in without thinking, and cached input at $0.02 per 1M (a 90% discount) makes repeated-prefix cases effectively free.
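To see what the 90% cached-input discount does for a repeated-prefix workload, here is a minimal sketch assuming a shared system prompt that fully hits the cache on every request (the 10,000/500 token split is a hypothetical example, not a measured workload):

```python
# Rates from the table above, dollars per 1M input tokens.
FRESH_PER_M = 0.20
CACHED_PER_M = 0.02  # 90% discount on cache hits

def input_cost(total_tokens: int, cached_tokens: int) -> float:
    """Input-side cost when `cached_tokens` of the prompt hit the cache."""
    fresh = total_tokens - cached_tokens
    return cached_tokens / 1e6 * CACHED_PER_M + fresh / 1e6 * FRESH_PER_M

# Hypothetical: 10,000-token system prompt (cached) + 500 fresh user tokens.
with_cache = input_cost(10_500, 10_000)   # ≈ $0.0003
without_cache = input_cost(10_500, 0)     # ≈ $0.0021
print(f"with cache ${with_cache:.6f}, without ${without_cache:.6f}")
```

In this shape the cache cuts input spend by 7x, which is why prefix-heavy routing and classification traffic ends up so cheap on nano.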
Calcis counts GPT-5.4 nano input tokens with o200k_base (tiktoken), the same tokenizer OpenAI uses for billing, so the token numbers you see match the invoice exactly.
Estimate your cost on GPT-5.4 nano
Paste your prompt into the estimator, pick GPT-5.4 nano, and see the estimated dollar cost - input tokens counted exactly with the provider's own tokenizer, output tokens predicted by our regression model.
Frequently asked
- How much does GPT-5.4 nano cost per request?
- A 1,000-token prompt with a 500-token reply costs about $0.0008 ($0.0002 input + $0.000625 output). At 10 million requests a month, that's around $8,250.
- Is GPT-5.4 nano better than GPT-5 nano?
- Newer generation, so typically a modest reasoning lift on complex tasks. GPT-5 nano is cheaper on both standard input ($0.05 vs $0.20 per 1M) and cached input ($0.005 vs $0.02). Pick based on whether the reasoning lift is worth the price difference for your workload.
- Does GPT-5.4 nano support the full 1M context?
- No - 400K context window, same as GPT-5.4 mini. Max output is 128K. For workloads with million-token prompts, use full GPT-5.4 or GPT-4.1.
- When should I not use GPT-5.4 nano?
- Any task requiring careful multi-step reasoning, nuanced judgment, or long chain-of-thought. Nano is fast and cheap but not deep - for reasoning workloads, use o3, o4-mini, or full GPT-5.4.
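The monthly-budget arithmetic in the FAQ generalizes to any fixed request shape; a minimal sketch at the rates on this page (the 10M-request volume is the hypothetical from the FAQ, not a recommendation):

```python
def monthly_cost(requests: int, in_tok: int, out_tok: int,
                 in_rate: float = 0.20, out_rate: float = 1.25) -> float:
    """Monthly dollar spend for `requests` calls of a fixed token shape,
    at GPT-5.4 nano's per-1M-token rates (no cached-input discount)."""
    per_call = in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate
    return requests * per_call

# 10M requests/month, 1,000 tokens in / 500 out ≈ $8,250.
print(f"${monthly_cost(10_000_000, 1_000, 500):,.2f}")
```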
Pricing verified 2026-04-06 from the provider's rate card.