OpenAI

GPT-4.1 pricing

Previous-generation flagship with a 1M context window. Cached input is billed at a 75% discount, so repeated prefixes cost a quarter of the standard input rate.

Input: $2.00 / 1M tokens
Output: $8.00 / 1M tokens
Context window: 1.0M tokens
Max output: 33K tokens
Cached input: $0.50 / 1M tokens
Verified: 2026-04-06

GPT-4.1 was OpenAI's headline long-context release before the GPT-5 family. At $2 per 1M input tokens and $8 per 1M output tokens, it is more expensive than GPT-5 on input ($2 vs $1.25) but cheaper on output ($8 vs $10). It also keeps its 1M context window, against 400K on GPT-5, so it remains a reasonable choice for long-document workloads, especially those already benchmarked on this model.
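Because GPT-4.1 is dearer on input and cheaper on output, which model costs less depends on how many output tokens you generate per input token. A minimal sketch of that break-even, using only the rates quoted above (the function and variable names are illustrative, not part of any API):

```python
# Rates in USD per 1M tokens, copied from the figures quoted on this page.
GPT_4_1 = {"input": 2.00, "output": 8.00}
GPT_5 = {"input": 1.25, "output": 10.00}


def cost_usd(rates, input_tokens, output_tokens):
    """Uncached cost of one request at flat per-token rates."""
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


def breakeven_output_per_input():
    """Output tokens per input token at which both models cost the same.

    Solves 2.00*i + 8.00*o = 1.25*i + 10.00*o for o/i.
    """
    return (GPT_4_1["input"] - GPT_5["input"]) / (GPT_5["output"] - GPT_4_1["output"])


print(breakeven_output_per_input())  # 0.375
```

Under these rates, and ignoring caching, GPT-4.1 comes out cheaper once a request produces more than about 0.375 output tokens per input token; below that ratio GPT-5 is cheaper.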

The cached input rate of $0.50 per 1M is a 75% discount on the standard rate - less aggressive than the 90% discount OpenAI applies on GPT-5 and later, but still substantial on long repeated prefixes. Caching is automatic when prefixes repeat within a 5-minute window.
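To see what the discount means on a single request, here is a rough sketch that splits the input into cached and uncached tokens and prices each part at the rates above. The helper name and the example token counts are assumptions for illustration; how many tokens actually hit the cache is something the provider reports, not something this code can determine.

```python
STANDARD_INPUT_PER_M = 2.00  # USD per 1M uncached input tokens (from this page)
CACHED_INPUT_PER_M = 0.50    # USD per 1M cached input tokens (from this page)


def input_cost_usd(total_input_tokens, cached_tokens=0):
    """Input-side cost when `cached_tokens` of the prompt are billed at the cached rate."""
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    uncached_tokens = total_input_tokens - cached_tokens
    return (
        uncached_tokens * STANDARD_INPUT_PER_M + cached_tokens * CACHED_INPUT_PER_M
    ) / 1_000_000


# Hypothetical 100,000-token prompt whose first 90,000 tokens are a repeated prefix.
cold = input_cost_usd(100_000)          # nothing cached
warm = input_cost_usd(100_000, 90_000)  # prefix served from cache
print(f"cold: ${cold:.3f}, warm: ${warm:.3f}")
```

In this example the input cost drops from $0.200 to $0.065: the cached 90,000 tokens cost $0.045 instead of $0.180, and the remaining 10,000 tokens are billed at the standard rate.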

Calcis counts GPT-4.1 input tokens with o200k_base (tiktoken), matching the tokenizer OpenAI bills against, so the token numbers you see are exactly what lands on your invoice.
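If you want to reproduce a count yourself, a minimal sketch with the open-source `tiktoken` package looks like the following. It assumes `tiktoken` is installed and can load the `o200k_base` encoding; the helper names are illustrative and this is not Calcis's own implementation.

```python
INPUT_PER_M = 2.00  # USD per 1M input tokens (from this page)


def count_input_tokens(text: str) -> int:
    """Count tokens in `text` with the o200k_base encoding."""
    import tiktoken  # third-party package, imported lazily so the module loads without it

    encoding = tiktoken.get_encoding("o200k_base")
    return len(encoding.encode(text))


def input_cost_usd(token_count: int) -> float:
    """Uncached input cost for a given token count."""
    return token_count * INPUT_PER_M / 1_000_000
```

For example, `input_cost_usd(count_input_tokens(prompt))` gives the uncached input cost of a prompt string. A raw text count like this may differ slightly from what a chat request is billed for, since message formatting can add a few tokens, so it is worth comparing against the usage figures the API returns.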

Estimate your cost on GPT-4.1

Paste your prompt into the estimator, pick GPT-4.1, and see the estimated dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.

Frequently asked

Should I migrate from GPT-4.1 to GPT-5?
Usually yes. GPT-5 is cheaper on input ($1.25 vs $2) and more expensive on output ($10 vs $8), and its context window is 400K. If you need the full 1M context for long-document work, stay on GPT-4.1 or move to GPT-4.1-mini; otherwise migrate to GPT-5.
How much does GPT-4.1 cost per request?
A 1,000-token prompt with a 500-token reply costs about $0.006 ($0.002 input + $0.004 output). The 4:1 output-to-input price ratio is lower than the 8:1 on GPT-5, so output-heavy workloads benefit more on 4.1, while prompt-heavy workloads gain more from GPT-5's cheaper input.
Does GPT-4.1 have a long-context surcharge?
No. Flat per-token rates across the full 1M context window - no threshold like Gemini 2.5 Pro.
What's the cached input discount on GPT-4.1?
$0.50 per 1M cached tokens - a 75% discount on the standard $2 input rate. Automatic when prompt prefixes repeat within 5 minutes.
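
The per-request figure in the FAQ above can be reproduced with a few lines of arithmetic. This is a sketch under the flat, uncached rates stated on this page; the function name is illustrative, and `Decimal` is used only to keep the dollar amounts free of floating-point noise.

```python
from decimal import Decimal

INPUT_PER_M = Decimal("2.00")   # USD per 1M input tokens (from this page)
OUTPUT_PER_M = Decimal("8.00")  # USD per 1M output tokens (from this page)
MILLION = Decimal(1_000_000)


def request_cost_usd(input_tokens: int, output_tokens: int):
    """Return (input_cost, output_cost, total) in USD for one uncached request."""
    input_cost = INPUT_PER_M * input_tokens / MILLION
    output_cost = OUTPUT_PER_M * output_tokens / MILLION
    return input_cost, output_cost, input_cost + output_cost


input_cost, output_cost, total = request_cost_usd(1_000, 500)
print(f"{input_cost:.3f} + {output_cost:.3f} = {total:.3f}")  # 0.002 + 0.004 = 0.006
```

Multiplying the total by an expected request volume gives a first-order monthly estimate, assuming every request has a similar shape and none of the input is cached.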

Pricing verified 2026-04-06 from the provider's rate card.