
Tokens to dollars converter

Pick a model, enter an input and output token count, see the exact dollar cost. Uses the live Calcis pricing table — no stale rate cards.


Example reading at GPT-5 rates (1,000 input / 500 output tokens):

Input cost: $0.00250 · Output cost: $0.00750 · Total: $0.01000

Rates: $2.50 / 1M input · $15.00 / 1M output

How the conversion works

LLM API cost is simply (input tokens × input rate) + (output tokens × output rate), where rates are quoted per 1 million tokens. A 1,000-token input on GPT-5 ($2.50/1M) costs $0.0025; a 500-token output ($15/1M) costs $0.0075.
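In code, the formula is a single expression. A minimal sketch (the `token_cost` helper is illustrative, not a Calcis API):

```python
def token_cost(input_tokens: int, output_tokens: int,
               input_rate: float, output_rate: float) -> float:
    """Dollar cost of one request; rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# 1,000 input + 500 output tokens at GPT-5 rates ($2.50 / $15.00 per 1M)
cost = token_cost(1_000, 500, 2.50, 15.00)
print(f"${cost:.4f}")  # $0.0100
```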

The converter pulls its rates from the live Calcis pricing table and also applies tokenizer multipliers (Claude Opus 4.7 rebills at 1.15× the token count) and long-context tiers (Gemini Pro raises its rates above 200K input tokens). That makes the number you see match what you'll actually be billed.
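A tokenizer multiplier simply scales the billed token count before pricing. A sketch under that assumption (`billed_cost` and its parameters are ours, not the Calcis implementation):

```python
def billed_cost(input_tokens: int, output_tokens: int,
                input_rate: float, output_rate: float,
                multiplier: float = 1.0) -> float:
    """Cost in USD for one request; rates are USD per 1M tokens.
    multiplier rebills the token counts, e.g. 1.15 for Claude Opus 4.7."""
    return (input_tokens * input_rate +
            output_tokens * output_rate) * multiplier / 1_000_000

base = billed_cost(1_000, 500, 2.50, 15.00)
rebilled = billed_cost(1_000, 500, 2.50, 15.00, multiplier=1.15)
print(f"${base:.4f} -> ${rebilled:.4f}")  # $0.0100 -> $0.0115
```

The multiplier distributes over the sum, so scaling the total is equivalent to scaling each token count.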

For cached inputs (Anthropic prompt caching, OpenAI automatic caching), apply the cached-input rate manually — this converter uses base rates to avoid over-promising savings you haven't configured.

Cost landmarks at $2.50 input / $15 output (GPT-5)

1,000 input tokens: $0.0025
1,000 output tokens: $0.0150
10,000 tokens (5K in / 5K out): $0.0875
100,000 tokens (50K in / 50K out): $0.8750
1,000,000 tokens (500K in / 500K out): $8.75
A 30-page paper summary (24K in / 500 out): $0.068
A typical chat turn (700 in / 300 out): $0.006
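Every landmark above follows from the same formula. A quick script to reproduce them, with the GPT-5 rates hard-coded:

```python
IN_RATE, OUT_RATE = 2.50, 15.00  # GPT-5, USD per 1M tokens

def cost(i: int, o: int) -> float:
    return (i * IN_RATE + o * OUT_RATE) / 1_000_000

landmarks = [
    ("1,000 input tokens", cost(1_000, 0)),                  # 0.0025
    ("1,000 output tokens", cost(0, 1_000)),                 # 0.0150
    ("5K in / 5K out", cost(5_000, 5_000)),                  # 0.0875
    ("50K in / 50K out", cost(50_000, 50_000)),              # 0.8750
    ("500K in / 500K out", cost(500_000, 500_000)),          # 8.75
    ("paper summary, 24K in / 500 out", cost(24_000, 500)),  # 0.0675
    ("chat turn, 700 in / 300 out", cost(700, 300)),         # 0.00625
]
for label, c in landmarks:
    print(f"{label}: ${c:.4f}")
```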

Frequently asked

Why do the rates differ between input and output?

Output tokens cost 4-10× more because generation is inference-heavy: each output token requires a full forward pass through the model, whereas input tokens are processed in parallel in a single pass. This asymmetry holds across every major provider.

Does the calculator include cached-input discounts?

No — it uses the base rate so the number represents a worst-case read of the rate card. In practice, Anthropic's prompt cache cuts input cost 90%, OpenAI's cuts 50-75%, and Gemini's context caching cuts 75%. Apply the discount manually for your cache-hit rate.
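Applying the discount manually means splitting input tokens by your cache-hit rate. A sketch (the helper and its parameters are ours; cache-write surcharges, which some providers also bill, are ignored):

```python
def cached_input_cost(input_tokens: int, base_rate: float,
                      hit_rate: float, discount: float) -> float:
    """Input cost in USD with prompt caching.

    base_rate: USD per 1M input tokens
    hit_rate:  fraction of input tokens served from cache (0..1)
    discount:  fraction knocked off the rate for cached tokens,
               e.g. 0.90 for Anthropic prompt caching
    """
    cached = input_tokens * hit_rate
    fresh = input_tokens - cached
    return (fresh * base_rate + cached * base_rate * (1 - discount)) / 1_000_000

# 100K input tokens at $2.50/1M, 80% hit rate, 90% discount on cached reads
print(f"${cached_input_cost(100_000, 2.50, 0.80, 0.90):.4f}")  # $0.0700
```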

What's the tokenizer multiplier doing?

Claude Opus 4.7 shipped a retrained tokenizer that emits ~15% more tokens than Opus 4.6 for the same text. Calcis applies a 1.15× multiplier to Opus 4.7 estimates so the cost you see matches your actual bill. No other tracked model has an active multiplier.

How does the Gemini long-context tier work?

Gemini 2.5 Pro charges base rates ($1.25 input, $10 output per 1M) up to 200K input tokens, and higher rates ($2.50 / $15) above that. If your input exceeds 200K tokens, the converter applies the upper tier automatically.
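The tier check is a single comparison on the input token count before pricing. A sketch using the rates quoted above (the function name is ours):

```python
def gemini_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Gemini 2.5 Pro cost in USD: base rates up to 200K input tokens,
    upper-tier rates for the whole request beyond that."""
    if input_tokens > 200_000:
        in_rate, out_rate = 2.50, 15.00  # upper tier
    else:
        in_rate, out_rate = 1.25, 10.00  # base tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(f"${gemini_pro_cost(150_000, 2_000):.4f}")  # base tier: $0.2075
print(f"${gemini_pro_cost(250_000, 2_000):.4f}")  # upper tier: $0.6550
```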

What about batch API discounts?

OpenAI and Anthropic both offer 50% off batch API requests (asynchronous, delivered within 24 hours). This converter shows real-time pricing. For batch workloads, divide the shown cost by 2 — simple and correct.

Ready to estimate a real prompt?

Paste your actual text into the estimator for exact token counts and dollar costs across every model.