Gemini 2.5 Flash pricing
Google's mid-tier multimodal model: a 1M-token context window for roughly a tenth of Sonnet's price.
- Input: $0.30 / 1M tok
- Output: $2.50 / 1M tok
- Context window: 1M tok
- Max output: -
- Cached input: $0.030 / 1M tok
- Verified: 2026-04-06
Gemini 2.5 Flash is the model to reach for when you need a big context window without paying frontier prices. At $0.30 input / $2.50 output per 1M tokens, it's the cheapest credible option for document-scale workflows: 1M tokens of context costs you 30 cents to load, and a typical multi-paragraph response costs a fraction of a cent.
The trade-off is consistency. Flash is fast and cheap, but on hard reasoning tasks it lags Sonnet 4.6 and GPT-5. It's the right choice for: extracting structured data from long documents, summarising long inputs, batch classification of customer messages, and any agent loop where you'd otherwise burn money on Sonnet for tasks that don't need it.
Cached input is just $0.03 per 1M tokens (a 90% discount), which makes Flash particularly attractive for retrieval workflows where the same context window gets re-used across many follow-up queries. Calcis estimates your specific cost using Google's own tokenizer for input counting and a regression model for output prediction.
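As a sketch, the per-request arithmetic above can be written as a small helper. The function name and structure here are illustrative, not part of Calcis or the Gemini API; the rates are the ones from the table above.

```python
# Rates for Gemini 2.5 Flash, in USD per 1M tokens (from the table above).
INPUT_RATE = 0.30
OUTPUT_RATE = 2.50
CACHED_INPUT_RATE = 0.03  # cached input: 90% discount on the standard rate

def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the dollar cost of one Gemini 2.5 Flash call.

    cached_tokens is the portion of input_tokens billed at the cached rate.
    """
    fresh_tokens = input_tokens - cached_tokens
    return (fresh_tokens * INPUT_RATE
            + cached_tokens * CACHED_INPUT_RATE
            + output_tokens * OUTPUT_RATE) / 1_000_000

# Loading a full 1M-token context costs about $0.30:
print(f"${request_cost(1_000_000, 0):.4f}")
```

With the same helper, the "1,000-token prompt, 500-token response" example works out to `request_cost(1_000, 500)`, about $0.00155.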
Estimate your cost on Gemini 2.5 Flash
Paste your prompt into the estimator, pick Gemini 2.5 Flash, and see the exact dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.
Frequently asked
- How much does Gemini 2.5 Flash cost per request?
- A 1,000-token prompt with a typical 500-token response costs about $0.00155 ($0.0003 input + $0.00125 output). For a 100,000-token document with a 1,000-token summary, the same call runs about $0.0325.
- Is there a long-context surcharge on Gemini 2.5 Flash?
- No - unlike Gemini 2.5 Pro, the Flash tier bills at a flat $0.30 / $2.50 per 1M across the full 1M context window. This makes Flash significantly cheaper than Pro for very long inputs.
- How does Flash compare to GPT-5 mini?
- The two are priced close together: $0.30 vs $0.25 input and $2.50 vs $2.00 output, per 1M tokens. The big difference is context: Flash has 1M tokens vs GPT-5 mini's 400K. Pick Flash for document-scale work, GPT-5 mini for general-purpose chat.
- What does cached input cost on Flash?
- $0.03 per 1M tokens - a 90% discount on the standard input rate. Google applies prompt caching automatically when prefixes match across calls within the cache TTL.
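To make the caching discount concrete, here is a minimal sketch of a retrieval session where the same long context is re-sent across several follow-up queries. The function and scenario are hypothetical; the rates are the ones quoted above, and the model assumes the full context is a cache hit on every call after the first.

```python
# Rates for Gemini 2.5 Flash, in USD per 1M tokens (from the rate card above).
INPUT_RATE = 0.30
CACHED_INPUT_RATE = 0.03
OUTPUT_RATE = 2.50

def session_cost(context_tokens: int, queries: int,
                 output_tokens_each: int, cached: bool = True) -> float:
    """Dollar cost of `queries` calls that all re-send the same context.

    With caching, the first call pays the full input rate and every
    follow-up pays the cached rate for the shared prefix.
    """
    first_call = context_tokens * INPUT_RATE
    followup_rate = CACHED_INPUT_RATE if cached else INPUT_RATE
    followups = (queries - 1) * context_tokens * followup_rate
    outputs = queries * output_tokens_each * OUTPUT_RATE
    return (first_call + followups + outputs) / 1_000_000

# Ten queries over the same 100K-token document, 500 tokens of output each:
print(f"cached:   ${session_cost(100_000, 10, 500):.4f}")
print(f"uncached: ${session_cost(100_000, 10, 500, cached=False):.4f}")
```

Under these assumptions the cached session runs about $0.07 versus about $0.31 uncached, so the input side of the bill shrinks by roughly 4x once the context is a cache hit.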
Pricing verified 2026-04-06 from the provider's rate card.