OpenAI
GPT-4o pricing
The 2024 flagship that's still widely used in production. Multimodal, 128K context, with the same input price as GPT-5.4.
Input
$2.50 / 1M tok
Output
$10.00 / 1M tok
- Context window: 128K
- Max output: 16K
- Cached input: $1.25 / 1M
- Verified: 2026-04-06
GPT-4o was OpenAI's multimodal flagship through 2024 and is still pulling heavy production traffic at $2.50 per 1M input and $10 per 1M output. The 128K context window is small compared to the 1M windows on GPT-4.1 and GPT-5.4, but for chat-shaped workloads it's usually plenty. Native multimodal support (image and audio input) is a common reason 4o deployments stay put even when a cheaper text-only model would handle the text side of the workload.
Cached input at $1.25 per 1M (a 50% discount) is less aggressive than newer OpenAI models' 75-90% discounts, but still worth wiring up for long system prompts. For pure text workloads, GPT-5 at $1.25 / $10 is cheaper on input and the same on output - a straightforward upgrade path.
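As a rough sketch of how the cache discount plays out for a long system prompt, the Python below prices the input side of a batch of requests. The two rates are the ones quoted on this page; the function name, the token counts, and the assumption that every call after the first is billed at the cached rate for the full system prompt are illustrative only. Real cache hit rates depend on how the provider's cache behaves for your traffic.

```python
# USD per 1M tokens, taken from the figures quoted on this page.
GPT4O_INPUT = 2.50
GPT4O_CACHED_INPUT = 1.25


def batch_input_cost(system_tokens, user_tokens, requests, cached=True):
    """Input-side cost in USD for `requests` calls sharing one system prompt.

    Assumption (hypothetical, for illustration): with caching on, the first
    call pays the full rate for the system prompt and every later call pays
    the cached rate for it. User tokens are always billed at the full rate.
    """
    if requests <= 0:
        return 0.0
    cached_calls = requests - 1 if cached else 0
    full_calls = requests - cached_calls
    full_tokens = system_tokens * full_calls + user_tokens * requests
    cached_tokens = system_tokens * cached_calls
    return (full_tokens * GPT4O_INPUT + cached_tokens * GPT4O_CACHED_INPUT) / 1_000_000


# Example workload: 4,000-token system prompt, 200-token user turn, 1,000 calls.
without_cache = batch_input_cost(4_000, 200, 1_000, cached=False)
with_cache = batch_input_cost(4_000, 200, 1_000, cached=True)
print(f"uncached: ${without_cache:.3f}  cached: ${with_cache:.3f}")
```

Under these assumptions the saving approaches the 50% discount only when the system prompt dominates the input; the shorter the shared prefix relative to the user turn, the smaller the benefit.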
Calcis counts GPT-4o input tokens with o200k_base (tiktoken), the same tokenizer OpenAI bills against, and handles image and audio tokens separately when you include them in a multimodal request.
Estimate your cost on GPT-4o
Paste your prompt into the estimator, pick GPT-4o, and see the estimated dollar cost - input tokens counted exactly with the provider's own tokenizer, output tokens predicted by our regression model.
Frequently asked
- Should I migrate from GPT-4o to GPT-5?
- For text-only workloads, yes - GPT-5 is cheaper on input ($1.25 vs $2.50) and matches on output. For multimodal (image/audio) workloads, GPT-4o is still the baseline until you have tested whether GPT-5's multimodal support fits your specific use case.
- How much does GPT-4o cost per request?
- A 1,000-token prompt with a 500-token reply costs about $0.0075 ($0.0025 input + $0.005 output). Image inputs are priced separately - check the OpenAI rate card for image token conversion.
- Is GPT-4o multimodal?
- Yes - native image input, audio input, and audio output. Text output and image generation land at the standard per-token rate; audio tokens and image tokens have their own rate cards.
- What's the cached input discount on GPT-4o?
- $1.25 per 1M cached tokens - a 50% discount on the standard $2.50 input rate. Less aggressive than newer models but still substantial for system-prompt-heavy workloads.
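The per-request arithmetic in the answers above can be reproduced with a few lines of Python. This is a minimal sketch for text-only requests: the rates are the ones listed on this page, while the dictionary keys and the function name are hypothetical labels chosen for the example, not API model identifiers. Image and audio tokens are not covered.

```python
# (input, output) rates in USD per 1M tokens, as quoted on this page.
# The keys are illustrative labels, not official API model IDs.
RATES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-5": (1.25, 10.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Text-only cost in USD for one request, ignoring caching."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# The FAQ example: a 1,000-token prompt with a 500-token reply.
for model in RATES:
    print(f"{model}: ${request_cost(model, 1_000, 500):.5f}")
```

For the 1,000 / 500 token example this gives $0.0075 on GPT-4o and $0.00625 on GPT-5, which is the input-side saving the migration answer refers to.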
Pricing verified 2026-04-06 from the provider's rate card.