OpenAI
GPT-4.1 pricing
Previous-generation flagship with a 1M context window. Cached input at a 75% discount - repeated prompt prefixes cost a quarter of the standard rate.
Input
$2.00 / 1M tok
Output
$8.00 / 1M tok
- Context window: 1.0M
- Max output: 33K
- Cached input: $0.50 / 1M
- Verified: 2026-04-06
GPT-4.1 was OpenAI's headline long-context release before the GPT-5 family. At $2 per 1M input and $8 per 1M output, it's more expensive than GPT-5 on input ($2 vs $1.25) but cheaper on output ($8 vs $10). The 1M context window did not carry over to GPT-5, which tops out at 400K, so GPT-4.1 remains a reasonable choice for long-document workloads - especially ones that were benchmarked on this model.
The cached input rate of $0.50 per 1M is a 75% discount on the standard rate - less aggressive than the 90% discount OpenAI applies on GPT-5 and later, but still substantial on long repeated prefixes. Caching is automatic when prefixes repeat within a 5-minute window.
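To see how the cached rate plays out, here is a minimal sketch of the blended input cost when part of a prompt hits the cache. The rates come from the table above; the prompt and cache-hit sizes are made-up examples.

```python
# Sketch: effective GPT-4.1 input cost with prompt caching.
# Rates from the rate card above; token counts are illustrative.
STANDARD_INPUT = 2.00 / 1_000_000  # $ per fresh input token
CACHED_INPUT = 0.50 / 1_000_000    # $ per cached input token (75% off)

def input_cost(total_tokens: int, cached_tokens: int) -> float:
    """Cost when `cached_tokens` of the prompt are served from cache."""
    fresh = total_tokens - cached_tokens
    return fresh * STANDARD_INPUT + cached_tokens * CACHED_INPUT

# A 100K-token prompt where a 90K-token prefix repeats within 5 minutes:
print(round(input_cost(100_000, 90_000), 4))  # 0.065, vs 0.2 fully uncached
```

With a 90% cache-hit prefix, the input bill drops from $0.20 to $0.065 - roughly a 3x saving on this prompt shape.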
Calcis counts GPT-4.1 input tokens with o200k_base (tiktoken), matching the tokenizer OpenAI bills against, so the token numbers you see are exactly what lands on your invoice.
Estimate your cost on GPT-4.1
Paste your prompt into the estimator, pick GPT-4.1, and see the exact dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.
Frequently asked
- Should I migrate from GPT-4.1 to GPT-5?
- Usually yes. GPT-5 is cheaper on input ($1.25 vs $2) and marginally more expensive on output ($10 vs $8), but its context window is 400K rather than 1M. For long-document work where you need the full 1M context, stay on 4.1 or move to GPT-4.1-mini; otherwise migrate to GPT-5.
- How much does GPT-4.1 cost per request?
- A 1,000-token prompt with a 500-token reply costs about $0.006 ($0.002 input + $0.004 output). The 4:1 output-to-input ratio is lighter than the 8:1 on GPT-5, so prompt-heavy workloads benefit more on 4.1.
- Does GPT-4.1 have a long-context surcharge?
- No. Flat per-token rates across the full 1M context window - no threshold like Gemini 2.5 Pro.
- What's the cached input discount on GPT-4.1?
- $0.50 per 1M cached tokens - a 75% discount on the standard $2 input rate. Automatic when prompt prefixes repeat within 5 minutes.
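The per-request arithmetic from the FAQ above can be reproduced in a few lines. Rates are from the rate card; token counts are the FAQ's example.

```python
# Sketch: GPT-4.1 cost for a single request at the standard rates.
INPUT_RATE = 2.00 / 1_000_000   # $ per input token
OUTPUT_RATE = 8.00 / 1_000_000  # $ per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Total dollar cost for one uncached request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# The FAQ example: 1,000-token prompt, 500-token reply.
print(round(request_cost(1_000, 500), 6))  # 0.006
```

At these rates, output tokens cost 4x input tokens, which is why trimming reply length pays off faster than trimming the prompt.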
Pricing verified 2026-04-06 from the provider's rate card.