Question 1

Is Gemini 2.5 Flash-Lite really the cheapest model?

Accepted Answer

Among the 25+ models Calcis tracks, yes, with one caveat. Flash-Lite costs $0.10 per 1M input tokens; GPT-5 nano is cheaper on input at $0.05 per 1M but more expensive on output. Pick Flash-Lite when you need a 1M context window at the floor price.

Question 2

How much does Flash-Lite cost per request?

Accepted Answer

A 1,000-token prompt with a 500-token reply costs about $0.0003 ($0.0001 input + $0.0002 output). At 10 million requests a month, that's around $3,000.
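
The arithmetic above can be reproduced with a short sketch. It assumes the flat rates quoted in this FAQ ($0.10 per 1M input tokens, $0.40 per 1M output tokens); the function and constant names are illustrative only and not part of any SDK or of Calcis.

```python
# Rates quoted in this FAQ, in USD per 1M tokens (assumed flat, no surcharge).
INPUT_RATE_PER_M = 0.10
OUTPUT_RATE_PER_M = 0.40


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the flat per-token rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_RATE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATE_PER_M
    return input_cost + output_cost


def monthly_cost(input_tokens: int, output_tokens: int, requests: int) -> float:
    """Return the USD cost of `requests` identical requests."""
    return request_cost(input_tokens, output_tokens) * requests


# The example from the answer: 1,000-token prompt, 500-token reply.
per_request = request_cost(1_000, 500)
per_month = monthly_cost(1_000, 500, 10_000_000)

print(f"per request: ${per_request:.4f}")
print(f"per month:   ${per_month:,.0f}")
```

Swapping in your own prompt and reply sizes gives a rough budget; real traffic varies per request, so treat the result as an estimate rather than a bill.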

Question 3

Does Flash-Lite have a long-context surcharge?

Accepted Answer

No. The $0.10 input / $0.40 output rates per 1M tokens apply across the full 1M-token context window. Only the Pro tier (2.5 Pro, 3.1 Pro) has dual-tier pricing, which switches to a second rate above 200K input tokens.
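
As a minimal sketch of what flat pricing means in practice, the input cost below scales linearly all the way to the 1M-token limit, with no threshold at 200K. The rate and window size come from this FAQ; the names `input_cost` and `CONTEXT_WINDOW` are hypothetical.

```python
# Figures from this FAQ: flat input rate and context window size.
INPUT_RATE_PER_M = 0.10      # USD per 1M input tokens
CONTEXT_WINDOW = 1_000_000   # maximum input tokens


def input_cost(input_tokens: int) -> float:
    """Return the USD input cost at a single flat rate (no long-context tier)."""
    if not 0 <= input_tokens <= CONTEXT_WINDOW:
        raise ValueError(f"input_tokens must be between 0 and {CONTEXT_WINDOW}")
    return input_tokens / 1_000_000 * INPUT_RATE_PER_M


# The per-token rate is identical below and above 200K input tokens.
for tokens in (100_000, 200_000, 500_000, 1_000_000):
    print(f"{tokens:>9,} input tokens: ${input_cost(tokens):.2f}")
```

A tiered model would need a second rate and a threshold check in `input_cost`; Flash-Lite, per the answer above, does not.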

Question 4

What's the catch with Flash-Lite?

Accepted Answer

No hidden catch, but it is a smaller model: weaker on complex reasoning, long chain-of-thought, and tasks that require careful nuance. For bulk classification, routing, and simple chat, the cost is negligible; for hard tasks, use Flash or Pro.

Gemini 2.5 Flash-Lite pricing
