Converter

Tokens to API cost calculator

Fine-grained API cost per request. Pick a model, enter tokens, see input cost, output cost, and total — with tokenizer multipliers and long-context tiers applied.

Pick a model, enter token counts, see the dollar cost

Input cost

$0.00300

Output cost

$0.00750

Total

$0.01050

Input: $3.00 / 1M tokens · Output: $15.00 / 1M tokens

How the conversion works

Same math as the tokens-to-dollars converter but with a three-way breakdown (input cost / output cost / total) so you can see where the bill actually lands. Useful when deciding whether to trim the input (reduce context, truncate history) or trim the output (max_tokens cap).
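The breakdown is simple arithmetic. A minimal sketch (the function name `api_cost` is illustrative, not part of the tool), using the $3.00 / $15.00 per-1M rates shown above:

```python
def api_cost(tokens_in: int, tokens_out: int,
             rate_in: float, rate_out: float) -> dict:
    """Per-request cost breakdown; rates are dollars per 1M tokens."""
    input_cost = tokens_in / 1_000_000 * rate_in
    output_cost = tokens_out / 1_000_000 * rate_out
    return {
        "input": input_cost,
        "output": output_cost,
        "total": input_cost + output_cost,
    }

# 1,000 input and 500 output tokens at $3.00 / $15.00 per 1M
# reproduces the figures above: input $0.00300, output $0.00750, total $0.01050
breakdown = api_cost(1_000, 500, 3.00, 15.00)
```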

All rates come from the live Calcis pricing table, which is verified against each provider's published rate card and updated when providers change their pricing.

Where does the cost land?

At the $3.00 / $15.00 per-1M rates shown above:

Input-heavy workload (10K in / 500 out): ~80% of cost is input
Balanced workload (1K in / 1K out): ~83% of cost is output
Output-heavy workload (500 in / 2K out): ~95% of cost is output
Chat default (700 in / 300 out): ~70% of cost is output
Summarization (10K in / 500 out): ~80% of cost is input
RAG (4K context + 200 query / 400 out): ~68% of cost is input

Frequently asked

Is this different from the tokens-to-dollars converter?

Same math, different UI. This one breaks the total into input cost / output cost / total so you can see which side of the ledger dominates. Use /convert/tokens-to-dollars for a simpler view.

When should I optimize input vs output?

If your workload is input-heavy (summarization, long context), optimize context size (trim history, rerank retrieval, smaller chunks, caching). If output-heavy (creative generation, long replies), optimize max_tokens and model choice.
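To see which lever matters for a given workload, compare the two directly. A hedged sketch (the 40% reductions are illustrative, and the $3.00 / $15.00 per-1M rates are assumed from the page above):

```python
RATE_IN, RATE_OUT = 3.0, 15.0  # dollars per 1M tokens

def cost(tokens_in: int, tokens_out: int) -> float:
    return (tokens_in * RATE_IN + tokens_out * RATE_OUT) / 1_000_000

baseline = cost(10_000, 500)         # summarization-style request
trimmed_context = cost(6_000, 500)   # trim 40% of input (rerank, smaller chunks)
capped_output = cost(10_000, 300)    # cut 40% of output (tighter max_tokens)

print(f"baseline:        ${baseline:.4f}")
print(f"trim input 40%:  ${trimmed_context:.4f}")
print(f"cap output 40%:  ${capped_output:.4f}")
```

For this input-heavy example, trimming context saves more than capping output, even at a 5x output rate; the reverse holds for output-heavy workloads.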

Why is output so much more expensive than input?

Generation requires a full forward pass per output token, while input tokens are processed in parallel in a single pass. Output rates are typically 4-10× the input rate across major providers. For a 50/50 token split at a 4× rate ratio, about 80% of the cost is output.
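The output share follows directly from the rate ratio. A small sketch of the relationship (function name is illustrative):

```python
def output_share(rate_ratio: float, frac_out: float = 0.5) -> float:
    """Share of total cost from output tokens, given the output/input
    rate ratio and the output fraction of total tokens."""
    return rate_ratio * frac_out / ((1 - frac_out) + rate_ratio * frac_out)

# 50/50 token split:
output_share(4.0)   # 0.8  -> 80% of cost is output at a 4x rate ratio
output_share(10.0)  # ~0.91 at a 10x rate ratio
```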

What counts as 'input' exactly?

Everything sent to the model: system prompt + user message + prior messages + tool schemas + any structured-output JSON schema. OpenAI and Anthropic chat formats also add a small per-message token overhead, which this converter doesn't model (it's usually <5% of input).

Does this include fine-tuning costs?

No. Fine-tuning has a one-time training cost plus a usually higher per-token inference cost. This converter only handles base-model inference. For fine-tuning, see each provider's pricing page directly.

Ready to estimate a real prompt?

Paste your actual text into the estimator for exact token counts and dollar costs across every model.