OpenAI

o4-mini pricing

Smaller reasoning model. ~2x cheaper than o3 with similar chain-of-thought depth for most tasks.

Input

$1.10 / 1M tok

Output

$4.40 / 1M tok

Context window
200K
Max output
100K
Cached input
$0.275 / 1M
Verified
2026-04-06

o4-mini sits alongside o3 as the cost-optimised reasoning tier: $1.10 per 1M input and $4.40 per 1M output, with a 200K context window and 100K max output. Capability-wise, it's a genuine step down from o3 - the model reasons on a shorter leash - but for many structured extraction and multi-step code workloads the difference doesn't show up in end-to-end quality measurements.

Like o3, o4-mini bills reasoning tokens as part of output, so the visible answer length understates the actual cost. A response that looks like 200 tokens to you might cost as much as a 1,500-token reply from a non-reasoning model of comparable price. Calcis' output-length predictor accounts for the reasoning-token overhead so the dollar forecast doesn't under-count.
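The billing rule above is easy to sketch: reasoning tokens are charged at the output rate, so the effective output count is the visible answer plus the hidden reasoning. A minimal illustration (the rates are from the card above; the token counts are made-up examples):

```python
# o4-mini rates from the pricing card above.
INPUT_RATE = 1.10 / 1_000_000   # $ per input token
OUTPUT_RATE = 4.40 / 1_000_000  # $ per output token (reasoning tokens too)

def request_cost(input_tokens, visible_tokens, reasoning_tokens=0):
    """Dollar cost of one o4-mini request; reasoning tokens bill as output."""
    billed_output = visible_tokens + reasoning_tokens
    return input_tokens * INPUT_RATE + billed_output * OUTPUT_RATE

# A 200-token visible answer with 1,300 hidden reasoning tokens bills
# exactly like a 1,500-token reply:
print(round(request_cost(0, 200, 1300), 4))  # 0.0066
print(round(request_cost(0, 1500), 4))       # 0.0066
```

This is why a dollar forecast based only on visible answer length will under-count on o-series models.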

Input tokens are counted with o200k_base (tiktoken), the same tokenizer OpenAI uses for billing, so the input figure you see matches the one on your invoice exactly.

Estimate your cost on o4-mini

Paste your prompt into the estimator, pick o4-mini, and see the exact dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.

Frequently asked

When should I pick o4-mini over o3?
When you want reasoning capability but can't absorb o3's cost. o4-mini is ~2x cheaper on both input and output. Capability gap is real but workload-dependent - benchmark both on your specific task before deciding.
How much does o4-mini cost per request?
Depends on how much the model thinks. A simple query with a 500-token visible answer runs about $0.003. A reasoning-heavy problem with 3,000 reasoning tokens + 500-token answer runs about $0.017 - same input, same visible answer, 5x higher bill.
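The two figures in that answer can be reproduced with straight arithmetic, assuming a 1,000-token prompt in both cases (the prompt size is an illustrative assumption, not stated above):

```python
IN_RATE = 1.10 / 1_000_000   # $ per input token
OUT_RATE = 4.40 / 1_000_000  # $ per output token, reasoning included

# Simple query: 1,000 input tokens, 500-token visible answer, no reasoning.
simple = 1_000 * IN_RATE + 500 * OUT_RATE           # $0.0033 -> "about $0.003"

# Reasoning-heavy: same input, same answer, plus 3,000 reasoning tokens.
heavy = 1_000 * IN_RATE + (500 + 3_000) * OUT_RATE  # $0.0165 -> "about $0.017"

print(f"${simple:.4f} vs ${heavy:.4f}, {heavy / simple:.0f}x higher")
```

Under these assumptions the reasoning-heavy request costs exactly 5x the simple one, matching the ratio quoted above.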
Are o4-mini reasoning tokens cacheable?
No - reasoning tokens vary per request, so they can't be cached. The cached input discount ($0.275 per 1M, 75% off) applies only to the input side of your request.
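The cache discount only ever touches the input side, which matters most when a large shared prefix repeats across requests. A quick sketch (the 20,000-token prompt is an assumed example):

```python
UNCACHED_IN = 1.10 / 1_000_000   # $ per input token
CACHED_IN = 0.275 / 1_000_000    # $ per cached input token (75% off)

prompt_tokens = 20_000  # e.g. a large system prompt reused every request

full_price = prompt_tokens * UNCACHED_IN  # ~$0.0220 on a cache miss
discounted = prompt_tokens * CACHED_IN    # ~$0.0055 on a cache hit
print(f"${full_price:.4f} miss vs ${discounted:.4f} hit")
```

Output and reasoning tokens bill in full either way, so for reasoning-heavy workloads caching trims the bill less than the headline 75% suggests.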
Is o4-mini cheaper than GPT-5?
On headline rates, o4-mini ($1.10 input) is cheaper than GPT-5 ($1.25 input) and roughly matches on output. But reasoning tokens push the effective output count higher on o-series, so end-to-end cost usually lands above GPT-5 for equivalent workloads.
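The headline-vs-effective gap can be sketched with assumed workloads. Everything here except the two input rates ($1.10 and $1.25, both stated above) is a labeled assumption: GPT-5's output rate is taken as equal to o4-mini's ("roughly matches"), and the reasoning-token counts are invented for illustration:

```python
def total_cost(inp, out, in_rate, out_rate):
    """Dollar cost given token counts and $-per-1M rates."""
    return (inp * in_rate + out * out_rate) / 1_000_000

# Same task: 2,000 input tokens, 500-token visible answer.
# Assumption: o4-mini spends 2,000 reasoning tokens, GPT-5 only 500.
o4_mini = total_cost(2_000, 500 + 2_000, in_rate=1.10, out_rate=4.40)
gpt5 = total_cost(2_000, 500 + 500, in_rate=1.25, out_rate=4.40)
print(f"o4-mini ${o4_mini:.4f} vs GPT-5 ${gpt5:.4f}")
```

Under these assumptions o4-mini's cheaper headline input rate is swamped by its larger reasoning overhead, which is the pattern the answer above describes.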

Pricing verified 2026-04-06 from the provider's rate card.