OpenAI
o3 pricing
OpenAI's reasoning model. Chain-of-thought built in - you pay for the thinking, not just the answer.
Input
$2.00 / 1M tok
Output
$8.00 / 1M tok
- Context window: 200K
- Max output: 100K
- Cached input: $0.50 / 1M
- Verified: 2026-04-06
o3 is OpenAI's reasoning tier - the model thinks through a problem before answering, and you pay for those reasoning tokens as part of the output. At $2 per 1M input and $8 per 1M output, the headline price is identical to GPT-4.1, but the bill shape is different: o3 routinely produces far more output tokens per response because the chain-of-thought is real and the cost of that thinking lands in the output count.
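That bill shape is easy to sketch. A minimal cost calculator, assuming the rates on this page and treating reasoning tokens as billable output; the `o3_cost` helper is hypothetical, not an OpenAI API:

```python
# Rates from the o3 rate card above, in dollars per token.
INPUT_RATE = 2.00 / 1_000_000
OUTPUT_RATE = 8.00 / 1_000_000

def o3_cost(input_tokens: int, visible_tokens: int, reasoning_tokens: int = 0) -> float:
    """Estimate a single o3 request's cost in dollars.

    Reasoning tokens are invisible in the response but billed as output.
    """
    billed_output = visible_tokens + reasoning_tokens
    return input_tokens * INPUT_RATE + billed_output * OUTPUT_RATE

# Same 1,000-token prompt, same 500-token visible answer:
easy = o3_cost(1_000, 500)                          # no reasoning overhead
hard = o3_cost(1_000, 500, reasoning_tokens=5_000)  # model thought hard
print(f"easy: ${easy:.3f}, hard: ${hard:.3f}")
```

The two calls differ only in the hidden reasoning tokens, yet the second bill is roughly 7x the first - the same spread described in the FAQ below.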
For problems where reasoning quality matters - math, multi-step code, agentic tool use, careful structured extraction - o3 typically pays for itself despite the inflated output. For chat, summary, and routing, stay on GPT-5 or GPT-5 mini; there's no capability lift to justify the thinking overhead.
Calcis counts o3 input tokens with o200k_base (tiktoken) and predicts response length from the prompt. The prediction accounts for the reasoning token overhead on o-series models so the dollar forecast doesn't under-count.
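A sketch of that forecast shape, under loud assumptions: the real counter is tiktoken's `o200k_base` (stood in for here by a crude words-per-token heuristic), and `reasoning_overhead` is an illustrative guess, not Calcis's fitted regression:

```python
def rough_token_count(text: str) -> int:
    # Crude stand-in for tiktoken's o200k_base counter:
    # English prose averages roughly 1.3 tokens per word.
    return max(1, round(len(text.split()) * 1.3))

def forecast_cost(prompt: str, expected_visible_tokens: int,
                  reasoning_overhead: float = 4.0) -> float:
    # reasoning_overhead is a made-up multiplier for illustration:
    # o-series responses carry invisible reasoning tokens on top of
    # the visible answer, and all of them bill at the output rate.
    input_tokens = rough_token_count(prompt)
    output_tokens = expected_visible_tokens * (1 + reasoning_overhead)
    return input_tokens * 2.00e-6 + output_tokens * 8.00e-6

print(f"${forecast_cost('Prove that the square root of 2 is irrational.', 400):.4f}")
```

Note where the dollars go: even with a multi-sentence prompt, nearly all of the forecast is output-side, which is why under-counting reasoning tokens wrecks an o-series estimate.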
Estimate your cost on o3
Paste your prompt into the estimator, pick o3, and see the estimated dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.
Frequently asked
- What are reasoning tokens and do I pay for them?
- Reasoning tokens are the chain-of-thought o3 generates internally before producing the visible answer. They count as output tokens on your bill, so a response that looks like 200 tokens to you might be billed as 2,000 tokens if the model thought hard.
- How much does o3 cost per request?
- Depends heavily on problem difficulty. Assuming a ~1,000-token prompt, a simple query with a 500-token reply runs about $0.006, while a hard reasoning problem with 5,000 reasoning tokens plus a 500-token answer runs about $0.046 - same input, same visible answer, roughly 7x the bill.
- Is o3 worth using over GPT-5?
- For math, code, agentic loops, and structured reasoning, usually yes - o3 produces measurably better answers. For chat, summary, and routing, no - GPT-5 matches quality and costs less per response because there's no reasoning overhead.
- What's the cached input discount on o3?
- $0.50 per 1M cached tokens - a 75% discount on the standard $2 input rate. Caching applies to the input side only; reasoning tokens aren't cacheable because they vary per request.
Pricing verified 2026-04-06 from the provider's rate card.