© 2026 Calcis · LLM cost estimates before you run anything.

LLM API Pricing Calculator

Pick a model, enter your token counts, get the exact cost. Compare prices across every major provider in one place.

Paste a prompt instead →
Analyse a context file →

Configure

OpenAI · 128k context · 16k max output

Cost per request

Total: $0.0004
Input: $0.0001 ($0.15/M tokens)
Output: $0.0003 ($0.6/M tokens)

Volume projections

100 requests      $0.0450
1,000 requests    $0.4500
10,000 requests   $4.50
100,000 requests  $45.00

All models compared at 1k in / 500 out

Model                            Provider   Input rate  Output rate  Total cost
GPT-5 nano (cheapest)            OpenAI     $0.05/M     $0.4/M       $0.0003
Gemini 2.5 Flash-Lite            Google     $0.1/M      $0.4/M       $0.0003
GPT-4o mini                      OpenAI     $0.15/M     $0.6/M       $0.0004
GPT-5.4 nano                     OpenAI     $0.2/M      $1.25/M      $0.0008
Gemini 3.1 Flash-Lite (preview)  Google     $0.25/M     $1.5/M       $0.0010
GPT-4.1 mini                     OpenAI     $0.4/M      $1.6/M       $0.0012
GPT-5 mini                       OpenAI     $0.25/M     $2/M         $0.0013
Gemini 2.5 Flash                 Google     $0.3/M      $2.5/M       $0.0015
Gemini 3 Flash (preview)         Google     $0.5/M      $3/M         $0.0020
GPT-5.4 mini                     OpenAI     $0.75/M     $4.5/M       $0.0030
o4-mini                          OpenAI     $1.1/M      $4.4/M       $0.0033
Claude Haiku 4.5                 Anthropic  $1/M        $5/M         $0.0035
GPT-4.1                          OpenAI     $2/M        $8/M         $0.0060
o3                               OpenAI     $2/M        $8/M         $0.0060
Gemini 2.5 Pro                   Google     $1.25/M     $10/M        $0.0063
GPT-5                            OpenAI     $1.25/M     $10/M        $0.0063
GPT-4o                           OpenAI     $2.5/M      $10/M        $0.0075
Gemini 3.1 Pro (preview)         Google     $2/M        $12/M        $0.0080
GPT-5.4                          OpenAI     $2.5/M      $15/M        $0.0100
Claude Sonnet 4.6                Anthropic  $3/M        $15/M        $0.0105
Claude Sonnet 4.5                Anthropic  $3/M        $15/M        $0.0105
Claude Opus 4.6                  Anthropic  $5/M        $25/M        $0.0175
Claude Opus 4.5                  Anthropic  $5/M        $25/M        $0.0175
Claude Opus 4.7                  Anthropic  $5/M        $25/M        $0.0175
Claude Opus 4.1                  Anthropic  $15/M       $75/M        $0.0525
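The total-cost column is derivable from the rates alone: tokens × rate, divided by one million, summed over input and output. A quick sketch that reproduces a few rows (rates copied from the table above; the model list here is just a sample):

```python
# ($/M input, $/M output) rates for a sample of rows from the table.
RATES = {
    "GPT-5 nano":       (0.05, 0.40),
    "GPT-4o mini":      (0.15, 0.60),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 2.5 Pro":   (1.25, 10.00),
    "Claude Opus 4.1":  (15.00, 75.00),
}

def total_cost(in_rate: float, out_rate: float,
               in_tokens: int = 1_000, out_tokens: int = 500) -> float:
    """Dollar cost of one request at the table's 1k in / 500 out workload."""
    return (in_tokens * in_rate + out_tokens * out_rate) / 1_000_000

# Cheapest first, matching the table's sort order.
for model, rates in sorted(RATES.items(), key=lambda kv: total_cost(*kv[1])):
    print(f"{model:<17} ${total_cost(*rates):.4f}")
```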

How token pricing works

LLM APIs charge per token, not per request. A token is roughly 3/4 of a word in English. Every API call has two cost components: input tokens (your prompt, system instructions, and any context you send) and output tokens (the model's response).
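In code, the two cost components are a one-liner. A minimal sketch, using the GPT-4o mini rates from the table above ($0.15/M input, $0.60/M output):

```python
# Per-request cost = input component + output component.
INPUT_RATE_PER_M = 0.15    # $/M input tokens (GPT-4o mini, from the table)
OUTPUT_RATE_PER_M = 0.60   # $/M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single API call."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

print(f"${request_cost(1_000, 500):.5f}")  # prints $0.00045
```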

Output tokens are typically 3-15x more expensive than input tokens depending on the provider. This means the length of the model's response usually dominates your bill, not the length of your prompt.

Some providers offer additional pricing tiers. Google Gemini models switch to a higher rate when your input crosses a long-context threshold (usually 200k tokens). Anthropic and OpenAI offer discounted rates for cached inputs when the same prompt prefix is reused across requests.
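A sketch of how these tiers change the input-side bill. The 200k threshold mirrors the Gemini-style long-context cutoff described above; the dollar rates and the cache discount below are placeholders for illustration, not any provider's published prices:

```python
LONG_CONTEXT_THRESHOLD = 200_000  # tokens; Gemini-style cutoff from the text

def input_cost(total_tokens: int, cached_tokens: int = 0,
               base_rate: float = 1.25,   # $/M, standard tier (placeholder)
               long_rate: float = 2.50,   # $/M, once input crosses threshold
               cached_rate: float = 0.25) -> float:  # $/M, cache-hit tokens
    """Input cost with a long-context tier and a cached-prefix discount."""
    fresh = total_tokens - cached_tokens
    rate = long_rate if total_tokens > LONG_CONTEXT_THRESHOLD else base_rate
    return (fresh * rate + cached_tokens * cached_rate) / 1_000_000

print(input_cost(100_000))                        # standard tier
print(input_cost(300_000))                        # long-context tier applies
print(input_cost(100_000, cached_tokens=80_000))  # mostly cached prefix
```

Note how the cached-prefix discount makes repeated calls with the same system prompt much cheaper than the headline rate suggests.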

The costs shown here are per-request. In production, costs compound quickly: a feature that makes 10 LLM calls per user action, with 1,000 daily active users each triggering one action per day, adds up to 300,000 API calls per month. Use the volume projections above to plan accordingly.
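The arithmetic behind that compounding, assuming one action per daily active user per day and a $0.00045 per-request cost (1k in / 500 out at GPT-4o mini rates, as in the table above):

```python
# Back-of-envelope monthly projection. Assumptions: 10 LLM calls per
# user action, each DAU triggers one action per day, a 30-day month,
# and $0.00045 per request (1k in / 500 out at GPT-4o mini rates).
CALLS_PER_ACTION = 10
DAILY_ACTIVE_USERS = 1_000
DAYS_PER_MONTH = 30
COST_PER_REQUEST = 0.00045

monthly_calls = CALLS_PER_ACTION * DAILY_ACTIVE_USERS * DAYS_PER_MONTH
monthly_cost = monthly_calls * COST_PER_REQUEST
print(f"{monthly_calls:,} calls/month → ${monthly_cost:.2f}/month")
```

Even at sub-cent per-request prices, the monthly total lands in the hundreds of dollars, which is why the volume projections matter more than the single-request figure.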

Need to estimate costs from an actual prompt? The estimator counts exact tokens using each provider's own tokenizer and predicts the output length before you make the API call.