Google

Gemini 3.1 Flash-Lite (preview) pricing

The cheapest Gemini 3.x preview. 1M context at $0.25 per 1M input - almost too cheap to meter for routine classification and chat.

Input

$0.25 / 1M tok

Output

$1.50 / 1M tok

Context window
1M
Max output
-
Cached input
$0.025 / 1M
Verified
2026-04-06

Flash-Lite is Google's floor tier: $0.25 per 1M input and $1.50 per 1M output, 1M context window, no long-context surcharge. For bulk classification, log enrichment, routing decisions, and other workloads where you need a capable model but not a careful one, the math starts to look like “free.”

A 1,000-token prompt with a 500-token reply costs about $0.001. At ten million requests a month that's $10,000 - still cheap for a production-scale model, and a fraction of what the same traffic would cost on the Pro tier. Cached input at $0.025 per 1M (a 90% discount) pushes the repeat-prefix case close to zero.
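The arithmetic above is a straight linear rate card, so it fits in a few lines. A minimal sketch (rates as published on this page; the function name is ours):

```python
# Flash-Lite's published rates, in $ per 1M tokens.
INPUT_PER_M = 0.25
OUTPUT_PER_M = 1.50
CACHED_PER_M = 0.025  # cached input: 90% off the fresh-input rate

def request_cost(input_tok: int, output_tok: int, cached_tok: int = 0) -> float:
    """Dollar cost of one request; cached_tok is the prefix billed at the cached rate."""
    fresh = input_tok - cached_tok
    return (fresh * INPUT_PER_M
            + cached_tok * CACHED_PER_M
            + output_tok * OUTPUT_PER_M) / 1_000_000

per_req = request_cost(1_000, 500)          # the example above: $0.001
monthly = per_req * 10_000_000              # ten million requests: $10,000
with_cache = request_cost(1_000, 500, cached_tok=1_000)  # fully cached prompt
```

Fully caching the 1,000-token prompt drops the input side from $0.00025 to $0.000025 per request, which is why the repeat-prefix case rounds toward zero.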

Calcis calls Google's countTokens API for an exact input count and predicts response length from the prompt itself, so a full dollar forecast is available before the call fires.
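The exact-count side of that workflow can be sketched with the google-genai Python SDK's `count_tokens` call. This is a sketch, not Calcis's implementation: the model id string is an assumption for the preview, and the chars/4 fallback is a common rough heuristic we add for illustration, not a Google-documented rate.

```python
import os

def approx_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); only the API count is billable truth.
    return max(1, len(text) // 4)

def count_input_tokens(prompt: str,
                       model: str = "gemini-3.1-flash-lite-preview") -> int:
    """Exact count via countTokens when credentials are configured,
    heuristic fallback otherwise. The model id is an assumption."""
    if os.environ.get("GOOGLE_API_KEY"):
        from google import genai
        client = genai.Client()
        return client.models.count_tokens(model=model, contents=prompt).total_tokens
    return approx_tokens(prompt)
```

The authoritative number always comes from the API path; the fallback only exists so an estimate is available before credentials are wired up.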

Estimate your cost on Gemini 3.1 Flash-Lite (preview)

Paste your prompt into the estimator, pick Gemini 3.1 Flash-Lite (preview), and see the exact dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.

Frequently asked

How much does Gemini 3.1 Flash-Lite cost per request?
A 1,000-token prompt with a 500-token reply costs about $0.001. At one million requests a month, that's around $1,000 in API fees - for most teams, cheaper than the infrastructure that calls it.
Does Flash-Lite have a long-context surcharge?
No. The $0.25 / $1.50 per 1M rates apply across the full 1M context window. Long-context surcharges only exist on the Pro tier (3.1 Pro, 2.5 Pro) above 200K input tokens.
Is Flash-Lite cheaper than GPT-5 nano?
On raw rates, no. GPT-5 nano is cheaper on input ($0.05 vs $0.25), cached input ($0.005 vs $0.025), and output ($0.40 vs $1.50). Pick on capability and ecosystem, not on price alone.
What tokenizer does Flash-Lite use?
Google's proprietary SentencePiece variant. Calcis uses Google's countTokens API for the authoritative count - the same boundary Google uses to bill.
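The cost comparisons in the questions above can be checked with a few lines. A minimal sketch using the 1,000-in / 500-out example and the rate figures quoted on this page (the GPT-5 nano rates are taken from the FAQ as-is):

```python
# Per-request cost under a flat rate card (no caching), rates in $ per 1M tokens.
def per_request(in_tok: int, out_tok: int, in_rate: float, out_rate: float) -> float:
    return (in_tok * in_rate + out_tok * out_rate) / 1_000_000

flash_lite = per_request(1_000, 500, 0.25, 1.50)  # Flash-Lite: $0.001
nano = per_request(1_000, 500, 0.05, 0.40)        # GPT-5 nano: $0.00025
```

At these rates the nano request is 4x cheaper, which is why the FAQ's advice is to choose on capability rather than price.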

Pricing verified 2026-04-06 from the provider's rate card.