OpenAI

GPT-4o pricing

The 2024 flagship that's still in production everywhere. Multimodal, 128K context, priced at the same input rate as GPT-5.4.

Input: $2.50 / 1M tokens
Output: $10.00 / 1M tokens
Context window: 128K
Max output: 16K
Cached input: $1.25 / 1M tokens
Verified: 2026-04-06

GPT-4o was OpenAI's multimodal flagship through 2024 and is still pulling heavy production traffic at $2.50 per 1M input and $10 per 1M output. The 128K context window is small compared to the 1M windows on GPT-4.1 and GPT-5.4, but for chat-shaped workloads it's usually plenty. Native multimodal (image + audio input) support is the reason most 4o deployments don't migrate even when a cheaper text-only model would work.

Cached input at $1.25 per 1M (a 50% discount) is less aggressive than the 75-90% discounts on newer OpenAI models, but still worth wiring up for long system prompts. For pure text workloads, GPT-5 at $1.25 input / $10 output per 1M tokens is cheaper on input and the same on output - a straightforward upgrade path.
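
To make the cached-input discount concrete, here is a minimal sketch of the input-side arithmetic at the rates listed above. The function name, the split into cached and uncached tokens, and the 8,000-token system prompt are illustrative assumptions, not part of any OpenAI SDK; in a real deployment the cached token count comes from the provider's usage report.

```python
# Rates from this page, in USD per 1M tokens.
INPUT_RATE = 2.50
CACHED_INPUT_RATE = 1.25


def input_cost(prompt_tokens: int, cached_tokens: int = 0) -> float:
    """Return the input cost in USD, billing cached tokens at the cached rate."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    return (uncached_tokens * INPUT_RATE + cached_tokens * CACHED_INPUT_RATE) / 1_000_000


# Hypothetical workload: an 8,000-token system prompt plus a 500-token user turn.
cold = input_cost(8_500)                       # nothing served from cache
warm = input_cost(8_500, cached_tokens=8_000)  # system prompt served from cache
print(f"cold: ${cold:.6f}  warm: ${warm:.6f}  saved: {1 - warm / cold:.1%}")
```

Under these assumptions the cached request trims roughly 47% off the input bill, slightly under the headline 50% because the user turn is still billed at the full rate.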

Calcis counts GPT-4o input tokens with o200k_base (tiktoken), the same tokenizer OpenAI bills against, and handles image and audio tokens separately when you include them in a multimodal request.

Estimate your cost on GPT-4o

Paste your prompt into the estimator, pick GPT-4o, and see the estimated dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.

Frequently asked

Should I migrate from GPT-4o to GPT-5?
For text-only workloads, yes - GPT-5 is cheaper on input ($1.25 vs $2.50 per 1M tokens) and matches on output. For multimodal (image/audio) workloads, GPT-4o is still the baseline until you test whether GPT-5's multimodal support matches your specific use case.

How much does GPT-4o cost per request?
A 1,000-token prompt with a 500-token reply costs $0.0075 ($0.0025 input + $0.005 output). Image inputs are priced separately - check the OpenAI rate card for image token conversion.

Is GPT-4o multimodal?
Yes - native image input, audio input, and audio output. Text output and image generation are billed at the standard per-token rate; audio tokens and image tokens have their own rate cards.

What's the cached input discount on GPT-4o?
$1.25 per 1M cached tokens - a 50% discount on the standard $2.50 input rate. Less aggressive than newer models but still substantial for system-prompt-heavy workloads.
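
The per-request and migration answers above come down to the same arithmetic, which can be sketched as follows. This is a rough illustration for text-only requests using the rates quoted on this page; the dictionary keys are just labels for this example, and the function is a hypothetical helper, not a call from any SDK.

```python
# USD per 1M tokens as (input, output), as quoted on this page.
RATES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-5": (1.25, 10.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one text-only request at the listed rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Same request shape as the FAQ example: 1,000 tokens in, 500 tokens out.
for model in RATES:
    per_request = request_cost(model, input_tokens=1_000, output_tokens=500)
    print(f"{model}: ${per_request:.6f} per request")
```

Because both models share the same output rate, the gap between them narrows as replies get longer relative to prompts, so the migration saving matters most for prompt-heavy workloads.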

Pricing verified 2026-04-06 from the provider's rate card.