OpenAI
o4-mini pricing
Smaller reasoning model. ~2x cheaper than o3 with similar chain-of-thought depth for most tasks.
Input
$1.10 / 1M tok
Output
$4.40 / 1M tok
- Context window: 200K
- Max output: 100K
- Cached input: $0.275 / 1M
- Verified: 2026-04-06
o4-mini sits alongside o3 as the cost-optimised reasoning tier: $1.10 per 1M input and $4.40 per 1M output, 200K context window, 100K max output. Capability-wise it's a genuine step down from o3 - the model reasons on a shorter leash - but for a lot of structured extraction and multi-step code workloads the difference doesn't show up in end-to-end quality measurements.
Like o3, o4-mini bills reasoning tokens as part of output, so the visible answer length understates the actual cost. A response that looks like 200 tokens to you might cost as much as a 1,500-token reply from a non-reasoning model of comparable price. Calcis' output-length predictor accounts for the reasoning-token overhead so the dollar forecast doesn't under-count.
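The billing asymmetry above is easy to sketch. A minimal cost function using the rates on this page, where hidden reasoning tokens bill at the same output rate as the visible answer (the token counts in the example calls are illustrative, not measured):

```python
# Why visible answer length understates o4-mini cost: reasoning tokens
# bill at the output rate ($4.40 / 1M) even though you never see them.
INPUT_RATE = 1.10 / 1_000_000   # $ per input token
OUTPUT_RATE = 4.40 / 1_000_000  # $ per output token (visible + reasoning)

def request_cost(input_tokens: int, visible_tokens: int, reasoning_tokens: int) -> float:
    """Dollar cost of one o4-mini request."""
    billed_output = visible_tokens + reasoning_tokens
    return input_tokens * INPUT_RATE + billed_output * OUTPUT_RATE

# Same 1,000-token prompt, same 200-token visible answer, very different bills:
cheap = request_cost(1_000, 200, 0)          # no hidden reasoning
pricey = request_cost(1_000, 200, 1_300)     # 1,300 hidden reasoning tokens
print(f"${cheap:.5f} vs ${pricey:.5f}")
```

The second request costs roughly 4x the first with an identical prompt and an identical visible answer, which is exactly why a predictor has to estimate reasoning tokens, not just answer length.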
Input tokens are counted with o200k_base (tiktoken), the same tokenizer OpenAI uses for billing, so the input figure you see matches the one on your invoice exactly.
Estimate your cost on o4-mini
Paste your prompt into the estimator, pick o4-mini, and see the exact dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.
Frequently asked
- When should I pick o4-mini over o3?
- When you want reasoning capability but can't absorb o3's cost. o4-mini is ~2x cheaper on both input and output. The capability gap is real but workload-dependent - benchmark both on your specific task before deciding.
- How much does o4-mini cost per request?
- Depends on how much the model thinks. A simple query with a 500-token visible answer runs about $0.003. A reasoning-heavy problem with 3,000 reasoning tokens + a 500-token answer runs about $0.017 - same input, same visible answer, roughly 5x the bill.
- Are o4-mini reasoning tokens cacheable?
- No - reasoning tokens vary per request, so they can't be cached. The cached input discount ($0.275 per 1M, 75% off) applies only to the input side of your request.
- Is o4-mini cheaper than GPT-5?
- On headline rates, o4-mini ($1.10 input) is cheaper than GPT-5 ($1.25 input) and roughly matches on output. But reasoning tokens push the effective output count higher on o-series, so end-to-end cost usually lands above GPT-5 for equivalent workloads.
Pricing verified 2026-04-06 from the provider's rate card.