Google

Gemini 2.5 Flash-Lite pricing

The cheapest GA model we track. $0.10 per 1M input tokens, so a full million-token prompt costs ten cents. Ideal for bulk classification.

Input: $0.10 / 1M tok
Output: $0.40 / 1M tok
Context window: 1M
Max output: -
Cached input: $0.010 / 1M
Verified: 2026-04-06

Flash-Lite is the bottom of Google's GA line-up and, at $0.10 per 1M input and $0.40 per 1M output, the cheapest model Calcis tracks. The 1M context window is unchanged from the Flash tier, and there's no long-context surcharge - rates stay flat across the full window.

A 1,000-token prompt with a 500-token reply costs about $0.0003 ($0.0001 for input plus $0.0002 for output). Ten million of those requests a month cost $3,000, which is often less than the infrastructure calling them. Cached input at $0.01 per 1M (a 90% discount on the standard input rate) takes bulk repeat-prefix workloads close to zero marginal cost on the input side.
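The arithmetic above can be reproduced with a short sketch. It uses only the rates quoted on this page; the function name `request_cost` and the `cached_tokens` parameter are illustrative, and the sketch assumes cached prefix tokens are billed at the cached rate while the rest of the input is billed at the standard rate, ignoring any separate cache storage charge.

```python
from decimal import Decimal

# Rates quoted on this page, in USD per 1M tokens.
INPUT_PER_M = Decimal("0.10")
OUTPUT_PER_M = Decimal("0.40")
CACHED_INPUT_PER_M = Decimal("0.01")
MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Return the USD cost of one request.

    cached_tokens is the part of input_tokens assumed to be served from
    cache and billed at the cached rate (an assumption of this sketch).
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / MILLION


# The worked example from the text: 1,000 tokens in, 500 tokens out.
per_request = request_cost(1_000, 500)
monthly = per_request * 10_000_000  # ten million requests a month

# Same request with a 900-token cached prefix.
per_request_cached = request_cost(1_000, 500, cached_tokens=900)

print(f"per request: ${per_request}")
print(f"per month:   ${monthly}")
print(f"with cache:  ${per_request_cached}")
```

`Decimal` is used instead of floats so that sub-cent amounts stay exact when multiplied by millions of requests. Note that caching only reduces the input portion; the output charge is unchanged.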

Calcis counts input tokens against Google's own countTokens API so the number matches Google's billing boundary exactly, and predicts response length from the prompt itself rather than asking you to guess.
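As a rough illustration of the idea (not Calcis's actual implementation), input tokens can be counted with Google's `google-genai` Python SDK and priced at the input rate quoted above. The helper names `count_with_gemini` and `input_cost_usd` are hypothetical, the model ID `gemini-2.5-flash-lite` is assumed, and the SDK call needs the package installed and an API key configured in the environment.

```python
from typing import Callable

INPUT_USD_PER_M = 0.10  # input rate quoted on this page, USD per 1M tokens


def count_with_gemini(prompt: str, model: str = "gemini-2.5-flash-lite") -> int:
    """Count prompt tokens via the google-genai SDK (requires network and an API key)."""
    # Imported lazily so the rest of this sketch runs without the SDK installed.
    from google import genai

    client = genai.Client()  # picks up the API key from the environment
    response = client.models.count_tokens(model=model, contents=prompt)
    return response.total_tokens


def input_cost_usd(prompt: str, counter: Callable[[str], int] = count_with_gemini) -> float:
    """Price the input side of a prompt using whichever token counter is supplied."""
    return counter(prompt) * INPUT_USD_PER_M / 1_000_000


if __name__ == "__main__":
    # Offline usage with a stand-in counter (hypothetical: pretends the prompt is 1,000 tokens).
    print(input_cost_usd("example prompt", counter=lambda _prompt: 1_000))
```

Passing the counter as a parameter keeps the pricing logic testable offline; swapping in `count_with_gemini` gives a count from the provider's own tokenizer rather than a local approximation.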

Estimate your cost on Gemini 2.5 Flash-Lite

Paste your prompt into the estimator, pick Gemini 2.5 Flash-Lite, and see the estimated dollar cost: input tokens are counted with the provider's own tokenizer, and output tokens are predicted by our regression model.

Frequently asked

Is Gemini 2.5 Flash-Lite really the cheapest model?
Among the 25+ models Calcis tracks, yes - $0.10 per 1M input. GPT-5 nano at $0.05 is cheaper on input but more expensive on output. Pick Flash-Lite when you need a 1M context window at the floor price.
How much does Flash-Lite cost per request?
A 1,000-token prompt with a 500-token reply costs about $0.0003 ($0.0001 input + $0.0002 output). At 10 million requests a month, that's around $3,000.
Does Flash-Lite have a long-context surcharge?
No. The $0.10 / $0.40 per 1M rates apply across the full 1M context window. Only the Pro tier (2.5 Pro, 3.1 Pro) has dual-tier pricing above 200K input.
What's the catch with Flash-Lite?
No catch, but it's a smaller model - weaker on complex reasoning, long chain-of-thought, and anything requiring careful nuance. For bulk classification, routing, and simple chat it's effectively free; for hard tasks, use Flash or Pro.

Pricing verified 2026-04-06 from the provider's rate card.