Question 1

Which is cheaper, Gemini 2.5 Pro or Gemini 2.5 Flash?

Accepted Answer

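On a typical request of 1,000 input tokens and 2,000 output tokens, Gemini 2.5 Flash costs ~$0.0053 versus ~$0.0213 on Gemini 2.5 Pro. Flash is cheaper on both the input rate and the output rate, so at standard pricing it stays cheaper for any mix of input and output tokens. See the cost-per-request table for other request sizes.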

Question 2

What's the difference in per-token pricing?

Accepted Answer

Gemini 2.5 Pro charges $1.25 per 1M input tokens and $10.00 per 1M output tokens. Gemini 2.5 Flash charges $0.30 per 1M input tokens and $2.50 per 1M output tokens.
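
As a rough sketch of how these rates turn into a per-request figure, the snippet below multiplies token counts by the per-1M rates quoted above. The `RATES` table, the model keys, and the `request_cost` helper are illustrative names rather than part of any provider SDK, and the calculation assumes standard pricing (no caching, prompts under 200K tokens).

```python
# Standard per-1M-token rates in USD, copied from the answer above.
# The dictionary keys are illustrative labels, not official API model IDs.
RATES = {
    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
    "gemini-2.5-flash": {"input": 0.30, "output": 2.50},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at standard (uncached) rates."""
    rate = RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000  # rates are quoted per 1M tokens


if __name__ == "__main__":
    # The "typical request" from the comparison: 1,000 in / 2,000 out.
    for model in RATES:
        print(f"{model}: ${request_cost(model, 1_000, 2_000):.5f}")
```

Because both of Flash's rates are lower than Pro's, this function returns a lower cost for Flash for every non-zero token count.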

Question 3

Which has the bigger context window?

Accepted Answer

Gemini 2.5 Pro has the larger context window: 2M tokens versus 1M for Gemini 2.5 Flash.

Question 4

Is there a cached-input discount on either?

Accepted Answer

Yes, on both. Gemini 2.5 Pro bills cached input at $0.125 per 1M tokens (90% off its standard input rate). Gemini 2.5 Flash bills cached input at $0.030 per 1M tokens (also 90% off). Workloads with repeated static prefixes see the biggest savings.
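
To illustrate the effect of the discount, here is a minimal sketch that splits a prompt into cached and fresh input tokens and prices each part at the rates quoted above. The `input_cost` helper and the model keys are hypothetical, and the sketch assumes the cached rate is the only caching-related charge; any cache storage or write fees are not covered by this comparison and are ignored here.

```python
# Per-1M-token input rates in USD as (standard, cached), from the answer above.
# Keys are illustrative labels, not official API model IDs.
INPUT_RATES = {
    "gemini-2.5-pro": (1.25, 0.125),
    "gemini-2.5-flash": (0.30, 0.030),
}


def input_cost(model: str, input_tokens: int, cached_tokens: int = 0) -> float:
    """Return the USD input cost when `cached_tokens` of the prompt hit the cache."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    standard, cached = INPUT_RATES[model]
    fresh_tokens = input_tokens - cached_tokens
    return (fresh_tokens * standard + cached_tokens * cached) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000-token prompt with an 8,000-token static prefix.
    uncached = input_cost("gemini-2.5-pro", 10_000)
    with_cache = input_cost("gemini-2.5-pro", 10_000, cached_tokens=8_000)
    print(f"uncached: ${uncached:.5f}, with cache: ${with_cache:.5f}")
```

The larger the share of the prompt that is a reusable prefix, the closer the input bill gets to one tenth of the uncached figure.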

Question 5

Does Gemini 2.5 Pro have a long-context surcharge?

Accepted Answer

Yes. Above 200K input tokens, Gemini 2.5 Pro bills at $2.50 input / $15.00 output per 1M instead of the standard rate.
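
The tiering can be sketched as a simple rate switch. This example assumes that "200K" means exactly 200,000 tokens and that, once the prompt exceeds the threshold, the higher rates apply to every token in the request rather than only to the excess; the answer above does not spell out either detail, so treat both as assumptions. `pro_request_cost` is an illustrative helper name.

```python
# Gemini 2.5 Pro rates in USD per 1M tokens as (input, output), from the answers above.
STANDARD_RATES = (1.25, 10.00)
LONG_CONTEXT_RATES = (2.50, 15.00)

# Assumption: "200K" is taken to mean exactly 200,000 input tokens.
LONG_CONTEXT_THRESHOLD = 200_000


def pro_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one Gemini 2.5 Pro request, including the surcharge tier."""
    # Assumption: the whole request is billed at the long-context rates
    # as soon as the prompt is above the threshold.
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        in_rate, out_rate = LONG_CONTEXT_RATES
    else:
        in_rate, out_rate = STANDARD_RATES
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    for prompt_size in (100_000, 200_000, 300_000):
        cost = pro_request_cost(prompt_size, 10_000)
        print(f"{prompt_size:>7} in / 10,000 out: ${cost:.3f}")
```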

Question 6

How fresh is this comparison?

Accepted Answer

Both Gemini 2.5 Pro and Gemini 2.5 Flash were re-verified on 2026-04-06 against the provider's published rate card. Calcis re-checks every row on a rolling schedule and redeploys when a provider changes pricing.

Scenario	Tokens (in / out)	Gemini 2.5 Pro	Gemini 2.5 Flash	Winner
Short prompt	100 / 200	$0.0021	$0.0005	Gemini 2.5 Flash
Typical request	1,000 / 2,000	$0.0213	$0.0053	Gemini 2.5 Flash
Long document	10,000 / 5,000	$0.0625	$0.0155	Gemini 2.5 Flash
Large prompt	100,000 / 10,000	$0.2250	$0.0550	Gemini 2.5 Flash

Traffic	Req / month	Gemini 2.5 Pro	Gemini 2.5 Flash	Delta
Small SaaS	1,000	$21.25	$5.30	Gemini 2.5 Flash -$15.95
Growing product	10,000	$212.50	$53.00	Gemini 2.5 Flash -$159.50
Heavy usage	100,000	$2,125.00	$530.00	Gemini 2.5 Flash -$1,595.00
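
The monthly figures are consistent with every request being the 1,000 / 2,000 "typical request" billed at standard rates. A minimal sketch under that assumption follows; `monthly_bill` and the model keys are illustrative names, and real traffic with mixed request sizes, caching, or long prompts would produce different totals.

```python
# Standard per-1M-token rates in USD as (input, output).
# Keys are illustrative labels, not official API model IDs.
RATES = {
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-2.5-flash": (0.30, 2.50),
}


def monthly_bill(
    model: str,
    requests_per_month: int,
    input_tokens: int = 1_000,   # assumed per-request prompt size
    output_tokens: int = 2_000,  # assumed per-request completion size
) -> float:
    """Return the projected monthly USD spend for uniform requests."""
    in_rate, out_rate = RATES[model]
    per_request = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    return requests_per_month * per_request


if __name__ == "__main__":
    for volume in (1_000, 10_000, 100_000):
        pro = monthly_bill("gemini-2.5-pro", volume)
        flash = monthly_bill("gemini-2.5-flash", volume)
        print(f"{volume:>7} req: Pro ${pro:,.2f}  Flash ${flash:,.2f}  delta ${pro - flash:,.2f}")
```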
