OpenAI

o3 pricing

OpenAI's reasoning model. Chain-of-thought built in - you pay for the thinking, not just the answer.

Input

$2.00 / 1M tok

Output

$8.00 / 1M tok

Context window
200K
Max output
100K
Cached input
$0.50 / 1M
Verified
2026-04-06

o3 is OpenAI's reasoning tier: the model spends tokens thinking through a problem before it answers, and those reasoning tokens are billed as output. At $2 per 1M input and $8 per 1M output, the headline price is identical to GPT-4.1's, but the bill shape is different: o3 routinely produces far more output tokens per response because the chain-of-thought is real and its cost lands in the output count.
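That bill shape can be sketched in a few lines. A minimal cost model using the rates on this page, assuming a hypothetical 1,000-token prompt for the worked numbers (the function names and the prompt size are illustrative, not part of any API):

```python
# Sketch of o3 per-request cost. Rates are dollars per 1M tokens,
# taken from the rate card on this page.
INPUT_RATE = 2.00
OUTPUT_RATE = 8.00
CACHED_INPUT_RATE = 0.50

def o3_cost(input_tokens, output_tokens, reasoning_tokens=0, cached_tokens=0):
    """Reasoning tokens bill at the output rate; cached tokens replace
    part of the fresh input at the discounted rate."""
    fresh_input = input_tokens - cached_tokens
    return (
        fresh_input * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + (output_tokens + reasoning_tokens) * OUTPUT_RATE
    ) / 1_000_000

# A 1,000-token prompt with a 500-token visible answer:
print(round(o3_cost(1_000, 500), 4))                          # 0.006
# Same prompt, same visible answer, plus 5,000 reasoning tokens:
print(round(o3_cost(1_000, 500, reasoning_tokens=5_000), 4))  # 0.046
```

The second call is the "same input, same visible answer, 7x the bill" case: the visible response is identical, but 5,000 hidden reasoning tokens ride along at the $8 output rate.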

For problems where reasoning quality matters - math, multi-step code, agentic tool use, careful structured extraction - o3 typically pays for itself despite the inflated output. For chat, summary, and routing, stay on GPT-5 or GPT-5 mini; there's no capability lift to justify the thinking overhead.

Calcis counts o3 input tokens with o200k_base (tiktoken) and predicts response length from the prompt. The prediction accounts for the reasoning token overhead on o-series models so the dollar forecast doesn't under-count.

Estimate your cost on o3

Paste your prompt into the estimator, pick o3, and see the estimated dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.

Frequently asked

What are reasoning tokens and do I pay for them?
Reasoning tokens are the chain-of-thought o3 generates internally before producing the visible answer. They count as output tokens on your bill, so a response that looks like 200 tokens to you might be billed as 2,000 tokens if the model thought hard.
How much does o3 cost per request?
It depends heavily on problem difficulty. A simple query with a 500-token reply runs about $0.006. A hard reasoning problem with 5,000 reasoning tokens plus a 500-token answer runs about $0.046 - same input, same visible answer, 7x the bill.
Is o3 worth using over GPT-5?
For math, code, agentic loops, and structured reasoning, usually yes - o3 produces measurably better answers. For chat, summary, and routing, no - GPT-5 matches quality and costs less per response because there's no reasoning overhead.
What's the cached input discount on o3?
$0.50 per 1M cached tokens - a 75% discount on the standard $2 input rate. Caching applies to the input side only; reasoning tokens aren't cacheable because they vary per request.

Pricing verified 2026-04-06 from the provider's rate card.