[calcis]

Last price update 3 months ago · Q2 2026 pricing report published · See changelog

Stop your agent before it spends what you can't afford.

Q: How accurate are the cost estimates?

Input token counts are exact. We use the same tokeniser as each provider. Output predictions are estimates based on prompt patterns; actual output varies by model and response.

Q: Which models are supported?

Claude Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT-5.4, GPT-5, GPT-4o, Gemini 2.5 Pro, Gemini 2.5 Flash, and 20 more (28 in total). Updated when providers change pricing.

Q: Is my prompt data stored?

The raw text is never written to disk. We store a one-way SHA-512 hash plus token counts for 90 days (see our privacy policy). OpenAI tokenisation runs locally; Claude and Gemini tokenisation calls the providers' free countTokens endpoints, so the prompt is sent to them purely for counting.
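As a rough illustration of what "a one-way hash plus token counts" could look like, the sketch below hashes a prompt with SHA-512 from Python's standard library and keeps only the digest and the counts. The record shape and the function name are assumptions made for this example, not Calcis's actual schema.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRecord:
    """Hypothetical stored record: no raw prompt text, only a digest and counts."""
    prompt_sha512: str
    input_tokens: int
    predicted_output_tokens: int


def make_record(prompt: str, input_tokens: int, predicted_output_tokens: int) -> PromptRecord:
    # SHA-512 is a one-way function: the digest is not designed to be turned back into text.
    digest = hashlib.sha512(prompt.encode("utf-8")).hexdigest()
    return PromptRecord(digest, input_tokens, predicted_output_tokens)


record = make_record("Review this diff for bugs.", input_tokens=87, predicted_output_tokens=35)
print(len(record.prompt_sha512))  # 128 hex characters
```

The same prompt always produces the same digest, so a record like this can identify a repeated prompt without retaining its text.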

Q: What's free vs paid?

Free (Initium) gets unlimited token-exact counts on every supported model, the heuristic and regression cost predictors, the multi-model calculator, the workload calculators, the CLI, and the GitHub Action on public repos. Pro ($15/mo) adds the LLM-assisted predictor (Precise mode), the Bayesian P10 to P90 confidence band, the multi-turn session simulator, the context-file analyser, and a public REST API key. Max ($29/mo) adds prompt compression and higher quotas. Team ($40/seat/mo) adds pooled quotas and admin dashboards. Full breakdown lives on the pricing page.

Q: Who maintains Calcis?

Calcis is an independent project maintained by a single engineer (rc397 on GitHub). Every price change, source link, and changelog entry passes through one set of eyes before it publishes - no scraped content, no distributed editorial board. The tradeoff is that updates are manual; the benefit is that the dataset has a consistent, auditable hand behind every row.

Q: How do you detect when a provider changes a price?

A mix of provider blog RSS feeds, direct watches on the OpenAI / Anthropic / Google pricing pages, and manual monitoring of their changelog and release channels. Every published change is cross-referenced against at least one official provider URL before it lands in the dataset - you can see that source link on every entry of the public changelog.

Q: Why trust Calcis over a provider's own pricing page?

You shouldn't - the provider's own page is always the authoritative source. Calcis is an aggregator: it saves you from clicking through three vendor sites, surfaces moves the day they happen, and lets you diff the landscape across providers on an identical workload. Every model row links back to the provider's pricing page so two clicks take you to the ground truth.

Q: Which tokeniser does Calcis use for each model?

OpenAI models (GPT-4o, GPT-5, GPT-5.4) use o200k_base via js-tiktoken, computed locally so no prompt text leaves your browser. Legacy GPT-4 and GPT-3.5 fall back to cl100k_base (same library). Anthropic Claude models call the provider's free messages.countTokens endpoint with a 4s timeout and a heuristic fallback. Google Gemini models use the countTokens method on the Generative Language API with the same timeout and fallback behaviour.
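The routing described above can be sketched in a few lines. This is a minimal illustration, not Calcis's code: the model ID strings, the 4-characters-per-token fallback ratio, and the `remote_count` callable (a stand-in for a function that wraps a provider's countTokens call) are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

TIMEOUT_SECONDS = 4.0  # the 4s timeout mentioned above


def openai_encoding(model: str) -> str:
    """Pick an encoding name for an OpenAI model (model IDs are illustrative)."""
    if model == "gpt-4" or model.startswith(("gpt-4-", "gpt-3.5")):
        return "cl100k_base"  # legacy GPT-4 and GPT-3.5
    return "o200k_base"       # GPT-4o, GPT-5, GPT-5.4


def heuristic_count(text: str) -> int:
    """Fallback estimate, assuming roughly 4 characters per token."""
    return max(1, len(text) // 4)


def count_with_fallback(remote_count: Callable[[str], int], text: str,
                        timeout: float = TIMEOUT_SECONDS) -> tuple[int, bool]:
    """Return (token_count, is_exact); use the heuristic on timeout or error."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(remote_count, text)
    try:
        return future.result(timeout=timeout), True
    except Exception:  # covers both a timeout and a provider error
        return heuristic_count(text), False
    finally:
        pool.shutdown(wait=False)


def failing_provider(text: str) -> int:
    raise ConnectionError("provider unreachable")  # stand-in for a failed countTokens call


print(openai_encoding("gpt-4o"))         # o200k_base
print(openai_encoding("gpt-3.5-turbo"))  # cl100k_base
print(count_with_fallback(failing_provider, "x" * 40))  # (10, False)
```

Returning the `is_exact` flag alongside the count lets a caller show whether a number came from the provider or from the heuristic.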

Pre-flight cost guardrails for every LLM API call. Predict spend, set budgets, block PRs that blow them. 28 models across OpenAI, Anthropic, and Google.

Open the estimator · View price ledger

28 models tracked

<6h median announce→live · last: 3mo ago

149 price changes logged

Works with

OpenAI · Anthropic · Google

Live comparison · Code review

Prompt: 87 in / ~35 out tokens

| Model | Provider | Cost / call |
| --- | --- | --- |
| GPT-5 nano | OpenAI | <$0.0001 |
| Gemini 2.5 Flash-Lite | Google | <$0.0001 |
| GPT-4.1 | OpenAI | $0.0005 |
| Claude Opus 4.7 | Anthropic | $0.0015 |
| Claude Opus 4.1 (deprecated) | Anthropic | $0.0039 |

214.2× spread between cheapest (GPT-5 nano) and most expensive (Claude Opus 4.1).

Estimated offline · exact counts on /estimator · Open full estimator
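The spread figure above is plain arithmetic on per-token prices, and a short sketch makes it easy to check. The GPT-5 nano prices come from the price index on this page; the Claude Opus 4.1 prices ($15 in / $75 out per 1M tokens) are an assumption, since this page does not list them.

```python
def call_cost(input_tokens: int, output_tokens: int,
              in_per_million: float, out_per_million: float) -> float:
    """Cost in USD for one call, given USD prices per 1M tokens."""
    return (input_tokens * in_per_million + output_tokens * out_per_million) / 1_000_000


# Prompt shape from the demo: 87 input tokens, ~35 predicted output tokens.
IN_TOKENS, OUT_TOKENS = 87, 35

gpt5_nano = call_cost(IN_TOKENS, OUT_TOKENS, 0.05, 0.40)   # prices from the index on this page
opus_41 = call_cost(IN_TOKENS, OUT_TOKENS, 15.00, 75.00)   # assumed prices, not listed on this page

print(f"${opus_41:.4f}")              # $0.0039
print(f"{opus_41 / gpt5_nano:.1f}x")  # 214.2x
```

Under those assumed prices, the sketch reproduces both the $0.0039 per-call figure and the 214.2× spread.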

The problem

Most cost tools show you the bill. Calcis prevents it.

Agent loops, runaway batch jobs, and RAG queries that 10x context overnight are the three patterns that break budgets. Calcis catches all three before the call leaves your code.

Observability tools surface the spend after it has already happened. By the time the dashboard turns red, the invoice has already moved. We sit one step earlier in the loop: predict the cost, set the cap, block the PR before the regression ships.

The name is older than the category. Roman merchants counted with calculi, small pebbles, before committing to a deal. The cost was known first. Always first.

The etymology in full

Monthly invoice

$847.23

OpenAI API, billed after the fact.

No warning. No preview.

Why Calcis

Three jobs the guardrail does.

Predict

Predict the spend

Exact tokens for OpenAI and Google models, calibrated approximation for Anthropic until we wire their token-counting API back up. Output prediction with a P10 to P90 confidence band trained on 33,000 real prompt-response pairs. The cost number lands before the call does.

How prediction works
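To make the idea of a P10 to P90 band concrete, here is a minimal sketch. It swaps the Bayesian model described above for plain empirical percentiles over a handful of made-up output lengths, so it shows the shape of the result rather than Calcis's method. The sample data and function names are assumptions.

```python
import statistics


def output_band(observed_output_tokens: list[int]) -> tuple[float, float, float]:
    """Return (P10, P50, P90) of observed output lengths via empirical percentiles."""
    # quantiles(n=10) yields the nine decile cut points: index 0 is P10, index 8 is P90.
    deciles = statistics.quantiles(observed_output_tokens, n=10, method="inclusive")
    return deciles[0], statistics.median(observed_output_tokens), deciles[8]


def cost_band(input_tokens: int, band: tuple[float, float, float],
              in_per_million: float, out_per_million: float) -> tuple[float, ...]:
    """Turn an output-token band into a USD cost band for one call."""
    return tuple(
        (input_tokens * in_per_million + out_tokens * out_per_million) / 1_000_000
        for out_tokens in band
    )


# Made-up output lengths for similar prompts (illustrative only).
samples = [20, 25, 30, 32, 35, 38, 40, 45, 60, 90]
band = output_band(samples)
low, mid, high = cost_band(87, band, in_per_million=0.05, out_per_million=0.40)
print(f"P10 ${low:.6f} · P50 ${mid:.6f} · P90 ${high:.6f}")
```

The input side contributes a fixed amount because the input count is known; only the output term widens the band.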

Cap

Set the budget

Per-route, per-model, per-environment caps. Calcis knows what your prompt will cost. You decide what 'too much' is.

See the tiers

Block

Block the breach

GitHub Action fails the PR when predicted spend crosses the line. CLI exits non-zero. VS Code shows the warning inline. Catch it in dev, not in the bill.

See how it works in CI
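The cap and block steps above boil down to a lookup and a comparison. The sketch below is a hypothetical gate, not the Calcis CLI: the cap table, its `(route, model, environment)` key, the default cap, and the function name are all assumptions for illustration.

```python
import sys

# Hypothetical caps in USD per call, keyed by (route, model, environment).
CAPS = {
    ("/summarise", "claude-sonnet-4-6", "production"): 0.0100,
    ("/summarise", "claude-sonnet-4-6", "staging"): 0.0500,
}
DEFAULT_CAP = 0.0200  # assumed fallback when no specific cap is configured


def check_budget(route: str, model: str, environment: str, predicted_usd: float) -> int:
    """Return a process exit code: 0 when within the cap, 1 when over it."""
    cap = CAPS.get((route, model, environment), DEFAULT_CAP)
    if predicted_usd > cap:
        print(f"FAIL {route}: predicted ${predicted_usd:.4f} exceeds cap ${cap:.4f}",
              file=sys.stderr)
        return 1
    print(f"OK {route}: predicted ${predicted_usd:.4f} within cap ${cap:.4f}")
    return 0


# A CLI wrapper would hand this value to sys.exit() so that CI fails the job.
exit_code = check_budget("/summarise", "claude-sonnet-4-6", "production", 0.0112)
print(exit_code)  # 1
```

Returning the code instead of exiting inside the function keeps the check easy to unit-test.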

What the guardrail catches

Three patterns that break budgets.

The same three failure modes account for almost every surprise invoice we hear about. Calcis sees each one before the call leaves your code.

Agent loops

Your LangGraph agent hits a recursion edge case and runs 50 iterations on what should have been 3. Calcis caps the loop before the bill catches up.

See the pattern →
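A quick way to see why a runaway loop hurts more than its iteration count suggests: if each step re-sends a growing history, input tokens accumulate roughly quadratically. The sketch below uses placeholder numbers throughout (context sizes, growth per step, and the $3 / $15 per 1M prices are assumptions, not figures from this page).

```python
def loop_cost(iterations: int, base_context: int, growth_per_step: int,
              output_per_step: int, in_per_million: float, out_per_million: float) -> float:
    """Total USD for an agent loop whose context grows on every iteration."""
    total = 0.0
    for step in range(iterations):
        # The accumulated history is re-sent as input on each step.
        input_tokens = base_context + step * growth_per_step
        total += (input_tokens * in_per_million + output_per_step * out_per_million) / 1_000_000
    return total


# Placeholder workload: 2,000-token base prompt, +1,500 tokens of history per step,
# 500 output tokens per step, at assumed prices of $3 in / $15 out per 1M tokens.
expected = loop_cost(3, 2_000, 1_500, 500, 3.0, 15.0)
runaway = loop_cost(50, 2_000, 1_500, 500, 3.0, 15.0)
print(f"3 iterations:  ${expected:.4f}")
print(f"50 iterations: ${runaway:.4f}")
print(f"ratio: {runaway / expected:.0f}x")
```

With these placeholder numbers, the 50-iteration run costs far more than 50/3 times the 3-iteration run, because the later steps carry the largest contexts.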

Context bloat

Your RAG retriever pulls 40 chunks instead of 8 because someone tweaked the threshold. Calcis flags the cost spike on the next PR.

See the pattern →

Model creep

Someone routed a classification step to Opus. Calcis shows the 200x cost spread vs Haiku before you ship.

See the pattern →

Open source on GitHub · npx calcis · Tracking 28 models since March 2023

Calcis Price Index

Frontier model pricing, live.

Ranked by a 1K in / 500 out token benchmark. Updated 3 months ago.

View full index

| # | Model | Provider | Context | In / 1M | Out / 1M | Per call |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | GPT-5 nano | OpenAI | 400K | $0.05 | $0.40 | $0.0003 |
| 2 | Gemini 2.5 Flash-Lite | Google | 1M | $0.10 | $0.40 | $0.0003 |
| 3 | GPT-4o mini | OpenAI | 128K | $0.15 | $0.60 | $0.0004 |
| 4 | GPT-5.4 nano | OpenAI | 400K | $0.20 | $1.25 | $0.0008 |
| 5 | Gemini 3.1 Flash-Lite (preview) | Google | 1M | $0.25 | $1.50 | $0.0010 |
| 6 | GPT-5.5 nano | OpenAI | 400K | $0.25 | $1.50 | $0.0010 |
| 7 | GPT-4.1 mini | OpenAI | 1M | $0.40 | $1.60 | $0.0012 |
| 8 | GPT-5 mini | OpenAI | 400K | $0.25 | $2.00 | $0.0013 |

Showing 8 of 28 models

Prices sourced direct from provider docs · Methodology
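The per-call column in the index follows directly from the two per-million prices and the 1K in / 500 out benchmark. A minimal sketch of that ranking, using four rows copied from the index (the function and variable names are illustrative):

```python
# (model, input $/1M, output $/1M), copied from the index rows on this page.
PRICES = [
    ("GPT-5 mini", 0.25, 2.00),
    ("GPT-4.1 mini", 0.40, 1.60),
    ("GPT-4o mini", 0.15, 0.60),
    ("GPT-5 nano", 0.05, 0.40),
]

BENCH_IN, BENCH_OUT = 1_000, 500  # the 1K in / 500 out benchmark


def per_call(in_per_million: float, out_per_million: float) -> float:
    """USD cost of one benchmark call."""
    return (BENCH_IN * in_per_million + BENCH_OUT * out_per_million) / 1_000_000


# Cheapest first, as in the index.
ranked = sorted(PRICES, key=lambda row: per_call(row[1], row[2]))
for name, in_price, out_price in ranked:
    print(f"{name:<14} ${per_call(in_price, out_price):.5f}")
```

The sketch prints five decimals; the index rounds to four. It also shows why GPT-4.1 mini ranks ahead of GPT-5 mini despite a higher input price: at this shape the output price dominates.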

Why trust the numbers

Tracked, timestamped, verifiable.

Every price on Calcis resolves to a source and a date. If a provider moves a number, the change is logged here within hours, and you can audit the whole history at any time.

149 price changes logged · every one dated + sourced

28 models tracked · across the 3 major providers

52 source URLs cited · clickable provider backlinks

3 providers covered · OpenAI · Anthropic · Google

3+ years of history · oldest entry to today

From the Q2 2026 Pricing Report

“At 100K input tokens, 7.4× separates the cheapest flagship (Gemini 3.1 Pro (preview) at $0.22) from the most expensive (Claude Opus 4.1 at $1.65) on the same 2K-output shape. Anyone shipping RAG or document analysis at scale should be benchmarking across providers, not just across models.”

Calcis Q2 2026 Pricing Report · 28-model landscape · Read the report

Recent price changes

See all →

149 changes logged across 52 entries. No retroactive edits.

Hours, not days

New provider prices land on Calcis within hours of announcement. Every update is timestamped on the public changelog.

View changelog

Auditable trail

Every price change is logged with its date, what moved, and a link to the provider announcement that triggered it. No silent edits.

See every change

Source-linked

Each model row carries a direct link to the provider's own pricing page. Anything we publish, you can verify in two clicks.

Browse model sources

Machine-readable

The full pricing dataset ships as CSV and JSON-LD alongside a public RSS feed. Build on it, diff it, pipe it into your own tools.

Subscribe to RSS

By workload

Pick the shape of the question.

Each calculator ranks every tracked model cheapest-first for that specific workload, with sensible defaults and tuning inputs.

Classification API cost calculator

Tagging, labelling, routing, and moderation workloads. Short inputs, tiny outputs, huge volume: exactly the shape the cheapest models were built for.

Open calculator →

Embedding API cost calculator

Compute the cost of embedding a corpus. One-off index builds, continuous indexing, and query-time embedding are all priced here.

Open calculator →

Coding agent cost calculator

Agentic coding loops are the most expensive LLM workload there is: many tool calls, big context windows, long outputs. Cost yours honestly.

Open calculator →

Summarisation API cost calculator

Cost out a batch of document summaries across every major LLM. Long input, short output: the inverse of a chatbot.

Open calculator →

Chatbot API cost calculator

Estimate what a production chatbot costs per conversation and per month across every major LLM (GPT-5, Claude, Gemini), with realistic defaults.

Open calculator →

RAG pipeline cost calculator

Cost out a retrieval-augmented generation pipeline: embedding queries, retrieving chunks, and paying for the context-heavy chat call on top.

Open calculator →

See all on the workload calculators page.

Pricing

Free for the basics. Pro for the predictor.

Yearly billing on every paid tier. Team and Enterprise on the full pricing page.

Free

Initium

Token-exact counts and the regression predictor.

Open the estimator

Pro

Certitudo

$15 / mo

LLM-assisted predictor, confidence band, REST API.

Reach certainty

Max

Potestas

$29 / mo

Prompt compression, higher quotas, priority email.

Unlock Max

Full feature comparison and FAQ on the pricing page.

Where to use it

Pipe Calcis into your stack.

Calcis isn't only a website. Four live surfaces wrap the same pricing dataset so you can estimate cost from CI, your terminal, your own API, or your RSS reader.

GitHub Action

Drop into any CI pipeline. Posts cost estimates on every pull request in under 30 seconds.

rc397/calcis-action@v1

See the workflow

CLI

Token counts and cost estimates from your terminal. No API key for the free tier.

npx calcis

View on npm

REST API

Key-scoped estimate endpoint. JSON in, JSON out. Same numbers the site renders.

POST /api/v1/estimate

API reference

Pricing feed

Subscribe to every change by RSS, or pull the dataset as an npm package to diff in your own tools.

@calcis/pricing

Subscribe to RSS

Works where you work

One step in your CI pipeline

Add Calcis to your GitHub workflow in 30 seconds.

.github/workflows/cost-estimate.yml · v1 · pinned

name: LLM Cost Estimate
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  estimate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: rc397/calcis-action@v1
        with:
          api-key: ${{ secrets.CALCIS_API_KEY }}
          model: claude-sonnet-4-6

Action source: rc397/calcis-action · MIT licensed · moves to calcis/calcis-action once the GitHub org is approved.

Get your API key at calcis.dev/dashboard

Calcis LLM Cost Estimate

| File | Tokens | Est. Cost | Model |
| --- | --- | --- | --- |
| prompts/chat.txt | 1,247 | $0.0084 | claude-sonnet-4-6 |
| prompts/system.txt | 423 | $0.0028 | claude-sonnet-4-6 |
| Total | 1,670 | $0.0112 | claude-sonnet-4-6 |

For your README

Show LLM cost in your README.

Drop a tiny SVG badge into your README, blog post, or docs. It shows the per-call cost for any tracked model and updates the day your provider's price moves. No re-deploy needed.

README.md · markdown

![LLM cost](https://www.calcis.dev/api/badge?model=claude-sonnet-4-6&tokens=1000&output=500&label=cost)
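If you generate badges for several models, building the URL in code avoids hand-editing query strings. The sketch below uses only the endpoint and the four query parameters visible in the snippet above (`model`, `tokens`, `output`, `label`). The helper name is made up, and model slugs other than `claude-sonnet-4-6` would be assumptions.

```python
from urllib.parse import urlencode

BADGE_ENDPOINT = "https://www.calcis.dev/api/badge"  # endpoint from the snippet above


def badge_markdown(model: str, tokens: int, output: int, label: str = "cost") -> str:
    """Build the README markdown for a cost badge."""
    query = urlencode({"model": model, "tokens": tokens, "output": output, "label": label})
    return f"![LLM cost]({BADGE_ENDPOINT}?{query})"


print(badge_markdown("claude-sonnet-4-6", tokens=1000, output=500))
```

`urlencode` also escapes any characters in a label that are not URL-safe, which matters once the label contains spaces.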

Make a badge

Live rendering · 1K in / 500 out: Claude Sonnet 4.6, GPT-5.4, Gemini 2.5 Flash

These come from the real badge endpoint: the same SVGs any embed receives, cached for 24h at the edge.

Pre-flight, not post-mortem.

Know what every LLM call will cost before you send it. 28 models, three providers, prices verified and timestamped. No account needed to start.

Open the estimator · Start free

Free forever tier · No credit card to start · Prices direct from providers