Free tool

Claude token counter

Approximate token counts for Claude Opus, Sonnet, and Haiku. Anthropic doesn't ship an offline tokenizer, so this tool uses a character-based estimate that's typically within ±10% of the real count.

Claude

~0

approx (±10%)

Characters

0

Words

0

What is a token?

Anthropic's Claude models (Opus, Sonnet, Haiku across the 3.x and 4.x families) use a proprietary tokenizer that Anthropic does not publish. The only way to get an exact token count is to call the messages.countTokens API endpoint, which requires an API key and a network roundtrip.

This tool uses a calibrated character-based heuristic (chars ÷ 4) that's accurate to within about ±10% for typical English prose. Anthropic itself recommends against using the legacy @anthropic-ai/tokenizer npm package: it's a Claude-2-era BPE table and is meaningfully wrong for modern Claude models.

If you need an exact count (for a bill audit or a production budget cap), use Anthropic's free countTokens endpoint. For quick estimation or UX previews, this heuristic is close enough.
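The heuristic described above is simple enough to show in a few lines. This is a minimal sketch (the function names are illustrative, not Calcis internals); it uses Math.ceil so short non-empty strings never show 0 tokens, which is why its output can differ by a token or two from the real counts in the table below:

```typescript
// Minimal sketch of the chars ÷ 4 heuristic used by this counter.
// Math.ceil avoids reporting 0 tokens for very short non-empty input.
function estimateClaudeTokens(text: string): number {
  if (text.length === 0) return 0;
  return Math.ceil(text.length / 4);
}

// Optional ±10% band to display alongside the point estimate.
function estimateRange(text: string): { low: number; high: number } {
  const est = estimateClaudeTokens(text);
  return { low: Math.floor(est * 0.9), high: Math.ceil(est * 1.1) };
}
```

For example, the 44-character pangram "The quick brown fox jumps over the lazy dog." estimates to 11 tokens, matching the real count for that sentence.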

Heads-up: Claude Opus 4.7 shipped with a retrained tokenizer that emits 1.0-1.35× as many tokens as Opus 4.6 on the same text. That isn't captured in this counter; it affects pricing but not the count you see here.

How tokens relate to characters

| Text | Characters | ~Tokens |
| --- | --- | --- |
| Hello | 5 | 1 |
| Hello, world! | 13 | 4 |
| The quick brown fox jumps over the lazy dog. | 44 | 11 |
| function add(a, b) { return a + b; } | 36 | 13 |
| pneumonoultramicroscopicsilicovolcanoconiosis | 45 | 11 |
| 🎉🚀✨ emoji counts are surprising | 30 | 12 |

Frequently asked

Why is the Claude count only approximate?

Anthropic doesn't publish their tokenizer. The only exact count comes from calling the countTokens API, which needs a network roundtrip. For a browser-side counter, a character-based heuristic is the practical choice — it's accurate to about ±10% for English prose.

How accurate is the chars÷4 heuristic?

Very accurate for English prose (±10%), less accurate for code (it slightly under-counts), and much less accurate for non-English scripts. For Chinese, Japanese, and Korean, Claude typically produces 2-3× more tokens per character than English.
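One way to soften the CJK error is to charge a higher per-character rate for CJK code points. This is a hedged refinement, not what this counter ships: the Unicode ranges and the 2.5× midpoint rate are rough assumptions, not calibrated values.

```typescript
// Hedged refinement: charge CJK characters at roughly 2.5x the English
// per-character rate (midpoint of the 2-3x range), i.e. ~0.625 tokens
// per CJK character vs ~0.25 for everything else. Ranges and rates are
// rough assumptions, not calibrated values.
const CJK_PATTERN = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/g;

function estimateTokensWithCJK(text: string): number {
  const cjkChars = (text.match(CJK_PATTERN) ?? []).length;
  const otherChars = text.length - cjkChars;
  return Math.ceil(otherChars / 4 + cjkChars * 0.625);
}
```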

Does Opus 4.7 really tokenize differently?

Yes. Anthropic shipped a retrained tokenizer with Opus 4.7 that emits 1.0-1.35× as many tokens as Opus 4.6 on identical input. We apply a 1.15× multiplier to Opus 4.7 cost estimates elsewhere in Calcis to reflect this. This counter doesn't, so assume your real Opus 4.7 token usage (and bill) will be roughly 15% higher than the count shown here.
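Applying that correction yourself is one line of arithmetic. A sketch, assuming the same 1.15× midpoint Calcis uses (the price constant below is a placeholder, not a real Anthropic rate):

```typescript
// Midpoint multiplier for the Opus 4.7 retrained tokenizer (1.0-1.35x
// range). Matches the adjustment Calcis applies elsewhere.
const OPUS_47_TOKEN_MULTIPLIER = 1.15;

function adjustForOpus47(estimatedTokens: number): number {
  return Math.round(estimatedTokens * OPUS_47_TOKEN_MULTIPLIER);
}

// Placeholder price per million input tokens; substitute the real rate.
function estimateCostUSD(tokens: number, pricePerMTokUSD: number): number {
  return (tokens / 1_000_000) * pricePerMTokUSD;
}
```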

Should I use the @anthropic-ai/tokenizer npm package?

No. Anthropic explicitly recommends against it. That package is a Claude-2-era BPE vocabulary that doesn't match any modern Claude model. A simple char-based heuristic is actually more accurate than the legacy tokenizer for Claude 3+.

How do I get an exact Claude token count?

Call Anthropic's messages.countTokens endpoint with the model ID. It's free and returns the exact count you'll be billed for. The Calcis /estimator does this automatically when you paste a prompt.
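A sketch of that call against Anthropic's REST endpoint follows. The URL, headers, and request/response shapes follow Anthropic's published Messages API; the model ID and key handling are illustrative, and `fetch` assumes Node 18+ or a browser:

```typescript
// Build the request for Anthropic's free count_tokens endpoint.
interface CountTokensRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildCountTokensRequest(
  apiKey: string,
  model: string,
  text: string
): CountTokensRequest {
  return {
    url: "https://api.anthropic.com/v1/messages/count_tokens",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: text }] }),
  };
}

// Returns the exact billable input token count for the given text.
async function exactTokenCount(
  apiKey: string,
  model: string,
  text: string
): Promise<number> {
  const req = buildCountTokensRequest(apiKey, model, text);
  const res = await (globalThis as any).fetch(req.url, {
    method: "POST",
    headers: req.headers,
    body: req.body,
  });
  const data = (await res.json()) as { input_tokens: number };
  return data.input_tokens;
}
```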

Know the tokens? Get the cost.

Once you've got a token count, the estimator turns it into a dollar forecast across every model.