OpenAI API Pricing (2026)
The GPT-5 series, GPT-4.1/4o, and the o-series reasoning models are all on one rate card. Compute costs at today's prices, see how batch and caching discounts change the bill, and pick the right tier for your workload.
Pricing verified 2026-04-06
Complete rate card
Every OpenAI model Calcis tracks, sorted cheapest to most expensive on a typical chat-shape request (1k in, 2k out). Rates are USD per 1M tokens.
| Model | Input / 1M | Output / 1M | Context | Max output |
|---|---|---|---|---|
| GPT-5 nano | $0.050 | $0.400 | 400K | 128K |
| GPT-4o mini | $0.150 | $0.600 | 128K | 16K |
| GPT-5.4 nano | $0.200 | $1.25 | 400K | 128K |
| GPT-4.1 mini | $0.400 | $1.60 | 1.0M | 33K |
| GPT-5 mini | $0.250 | $2.00 | 400K | 128K |
| GPT-5.4 mini | $0.750 | $4.50 | 400K | 128K |
| o4-mini | $1.10 | $4.40 | 200K | 100K |
| GPT-4.1 | $2.00 | $8.00 | 1.0M | 33K |
| o3 | $2.00 | $8.00 | 200K | 100K |
| GPT-5 | $1.25 | $10.00 | 400K | 128K |
| GPT-4o | $2.50 | $10.00 | 128K | 16K |
| GPT-5.4 | $2.50 | $15.00 | 1.1M | 128K |
Every OpenAI model
Each page has a headline price card, cost ladder, a narrative on when to reach for that model, and an FAQ with the common pricing questions answered from the live rate card.
GPT-5 nano
400K ctx · $0.050 in · $0.400 out · per 1M tok
GPT-4o mini
128K ctx · $0.150 in · $0.600 out · per 1M tok
GPT-5.4 nano
400K ctx · $0.200 in · $1.25 out · per 1M tok
GPT-4.1 mini
1.0M ctx · $0.400 in · $1.60 out · per 1M tok
GPT-5 mini
400K ctx · $0.250 in · $2.00 out · per 1M tok
GPT-5.4 mini
400K ctx · $0.750 in · $4.50 out · per 1M tok
o4-mini
200K ctx · $1.10 in · $4.40 out · per 1M tok
GPT-4.1
1.0M ctx · $2.00 in · $8.00 out · per 1M tok
o3
200K ctx · $2.00 in · $8.00 out · per 1M tok
GPT-5
400K ctx · $1.25 in · $10.00 out · per 1M tok
GPT-4o
128K ctx · $2.50 in · $10.00 out · per 1M tok
GPT-5.4
1.1M ctx · $2.50 in · $15.00 out · per 1M tok
What does GPT-5 nano vs GPT-5.4 actually cost?
Four workload shapes from tiny to massive. Output is roughly half the input (a typical chat/completion pattern). The ratio column shows the OpenAI spread at a glance.
| Scenario | Tokens (in / out) | GPT-5 nano | GPT-5.4 | Ratio |
|---|---|---|---|---|
| Tiny | 100 / 50 | $0.000025 | $0.00100 | 40.0× |
| Short request | 1,000 / 500 | $0.00025 | $0.0100 | 40.0× |
| Long document | 10,000 / 5,000 | $0.00250 | $0.1000 | 40.0× |
| Massive context | 100,000 / 50,000 | $0.0250 | $1.00 | 40.0× |
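Every row above is straight per-token arithmetic; a minimal sketch reproducing the table, with the two rate pairs copied from the rate card:

```python
def request_cost(tokens_in: int, tokens_out: int, rate_in: float, rate_out: float) -> float:
    """Cost of one request in USD, given rates in $ per 1M tokens."""
    return tokens_in / 1e6 * rate_in + tokens_out / 1e6 * rate_out

GPT5_NANO = (0.050, 0.400)   # ($/1M input, $/1M output)
GPT54 = (2.50, 15.00)

scenarios = [("Tiny", 100, 50), ("Short request", 1_000, 500),
             ("Long document", 10_000, 5_000), ("Massive context", 100_000, 50_000)]

for label, tin, tout in scenarios:
    nano = request_cost(tin, tout, *GPT5_NANO)
    flagship = request_cost(tin, tout, *GPT54)
    print(f"{label}: ${nano:.6f} vs ${flagship:.4f} ({flagship / nano:.1f}x)")
```

The ratio is a constant 40.0× because both the input and output rates of GPT-5.4 are exact multiples of GPT-5 nano's at this 2:1 input/output shape.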
Discounts & modifiers
Prompt caching. OpenAI caches the static prefix of your prompt automatically for any prompt with a shared prefix of at least 1,024 tokens. Cached input tokens bill at a 90% discount on GPT-5.4 ($0.250 / 1M vs $2.50 base) and at similar ratios across every GPT-4o/4.1/5 generation. No code changes required - just keep your system prompt static at the top of the message array.
Batch API discount. A 50% flat discount on both input and output for non-interactive workloads that can wait up to 24 hours (most finish in minutes). Calcis' estimator does not apply the batch discount by default because most live user requests are real-time, but divide the predicted cost by 2 for any workload that runs through the Batch endpoint.
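Both modifiers are plain multipliers on the base rates. A sketch of how they compose (rates from the card above; `cached_fraction` is an illustrative parameter, not an API field):

```python
CACHE_DISCOUNT = 0.90  # cached input tokens bill at 10% of the base input rate
BATCH_DISCOUNT = 0.50  # batch jobs bill at half price on input and output

def effective_cost(tokens_in, tokens_out, rate_in, rate_out,
                   cached_fraction=0.0, batch=False):
    """Request cost in USD with caching and batch modifiers applied.

    cached_fraction: share of input tokens served from the prompt cache.
    """
    cached = tokens_in * cached_fraction
    fresh = tokens_in - cached
    cost = (fresh * rate_in
            + cached * rate_in * (1 - CACHE_DISCOUNT)
            + tokens_out * rate_out) / 1e6
    return cost * (1 - BATCH_DISCOUNT) if batch else cost

# GPT-5.4 at $2.50 in / $15.00 out, with 80% of the prompt hitting the cache:
print(effective_cost(10_000, 2_000, 2.50, 15.00, cached_fraction=0.8))
```

Note the two discounts stack: a fully cached prompt run through Batch bills input at 5% of the list rate.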
Reasoning tokens (o-series). The o3 and o4-mini models bill internal reasoning tokens at the standard output rate. A typical o3 request emits 3-10x the visible output in hidden reasoning before producing the answer, so an o3 call priced at $8/1M output can easily bill like an $80/1M model in practice. For workloads that do not need chain-of-thought, a cheaper non-reasoning model is usually better value.
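The hidden-reasoning markup can be folded into an effective output rate, a rough sketch, treating the 3-10x figure as hidden tokens per visible output token:

```python
def effective_output_rate(rate_out: float, hidden_per_visible: float) -> float:
    """Billed $/1M *visible* output tokens once hidden reasoning is included.

    hidden_per_visible: reasoning tokens emitted per visible output token.
    """
    return rate_out * (1 + hidden_per_visible)

# o3 at $8/1M output, across the 3x-10x reasoning range quoted above:
print(effective_output_rate(8.00, 3))   # low end
print(effective_output_rate(8.00, 10))  # high end, roughly the $80/1M figure
```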
The spread between cheapest (GPT-5 nano) and most expensive (GPT-5.4) at the same chat-shape request is roughly 38×, which is why picking the right OpenAI model matters more than picking the right provider.
Which OpenAI model should I use?
For simple classification and extraction tasks, start with GPT-5 nano or GPT-4o mini - both cost cents per thousand requests and handle structured output well. Neither is a reasoning model, but most classification workloads do not need one.
For production chat and agentic workloads, GPT-5.4 mini is the sweet spot: much cheaper than full GPT-5.4 and GPT-5 while staying in the frontier-capability bracket. Flip to GPT-5.4 only when the task visibly benefits from the extra capacity.
For hard reasoning (complex code, multi-step math, ambiguous requirements), o3 or o4-mini justify their price tag. Reasoning tokens are billed at output rates, which is why o3 requests look expensive - they are, but the alternative is often a cheaper model that fails the task and burns more tokens retrying.
GPT-4o is the right pick when you need multimodal input (image/audio) at a Pro-tier price. GPT-4.1 kept its place for developers who wired their prompts to its specific long-context behavior.
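The guidance above condenses into a first-pass routing table. A sketch with hypothetical task labels and model identifier strings, these are illustrative, not an OpenAI API concept:

```python
# Hypothetical first-pass router following the guidance above.
ROUTES = {
    "classification": "gpt-5-nano",   # cheap, handles structured output
    "extraction": "gpt-4o-mini",
    "chat": "gpt-5.4-mini",           # production sweet spot
    "agentic": "gpt-5.4-mini",
    "hard_reasoning": "o3",           # reasoning tokens cost more, often worth it
    "multimodal": "gpt-4o",
    "long_context": "gpt-4.1",
}

def pick_model(task: str, escalate: bool = False) -> str:
    """Return a starting model; escalate bumps chat/agentic work to GPT-5.4."""
    model = ROUTES.get(task, "gpt-5.4-mini")
    if escalate and model == "gpt-5.4-mini":
        return "gpt-5.4"
    return model
```

A router like this only picks the starting point; escalation on failure (retry with a bigger model) is the usual second half of the pattern.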
Estimate your OpenAI costs →
Drop a prompt into the estimator, pick any OpenAI model, and get the exact dollar cost - input tokens counted with OpenAI's own tokenizer, output tokens predicted by our regression model.
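For a back-of-envelope number without the estimator, a common heuristic is roughly 4 characters per token for English text. A sketch using that approximation, not OpenAI's actual tokenizer:

```python
def rough_tokens(text: str) -> int:
    """Approximate token count via the ~4 chars/token rule of thumb."""
    return max(1, round(len(text) / 4))

def rough_cost(prompt: str, expected_out_tokens: int,
               rate_in: float, rate_out: float) -> float:
    """Ballpark USD cost for one request; rates in $ per 1M tokens."""
    return (rough_tokens(prompt) * rate_in + expected_out_tokens * rate_out) / 1e6

# GPT-5 mini at $0.25 in / $2.00 out:
print(rough_cost("Summarise the attached report in three bullet points.", 200, 0.25, 2.00))
```

The heuristic drifts badly on code, non-English text, and dense punctuation, which is why the estimator counts with the real tokenizer instead.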
Frequently asked
- How much does GPT-5.4 cost per 1M tokens?
- GPT-5.4 costs $2.50 per 1M input tokens and $15.00 per 1M output tokens. Cached input drops to $0.250 per 1M - a ~90% discount on repeated prompt prefixes.
- What is the cheapest OpenAI model?
- GPT-5 nano is the cheapest OpenAI model on a typical chat-shape request. At $0.05 input / $0.40 output per 1M tokens, it is cheap enough to run as a background classifier or a high-volume summariser without blowing the budget.
- Does OpenAI offer batch pricing discounts?
- Yes. The OpenAI Batch API offers a 50% discount on both input and output tokens in exchange for up to 24-hour completion (most jobs finish much faster). Batch is available for every chat-completion and embeddings model. If your workload is not latency-sensitive - nightly report generation, bulk embeddings, research corpora - the batch discount is the single biggest lever on your bill.
- How does OpenAI pricing compare to competitors?
- At the flagship tier, GPT-5.4 ($15/1M output) sits between Anthropic's Claude Opus 4.7 ($25/1M) and Google's Gemini 2.5 Pro ($10/1M). At the bottom of the rate card, GPT-5 nano undercuts everything except Gemini Flash-Lite tiers.
- What is the OpenAI free tier?
- OpenAI does not offer a free API tier today - every request is billed. For experimentation, the platform offers usage credits on new accounts ($5-$18 at the time of writing, varies by region). Beyond that, pricing is pay-as-you-go with monthly hard caps configurable in the OpenAI dashboard. Calcis does not charge to estimate costs; the free tier here lets you price prompts before you spend anything at OpenAI.
Pricing verified 2026-04-06 from OpenAI's published rate card.