
Words to tokens converter

Convert word counts to LLM token counts. 1 word ≈ 1.33 tokens for English prose. Works for any model family — the ratio is roughly consistent.

Assumes English prose. Code and non-English text will have a higher token-per-word ratio.

Quick math: 1,000 words ≈ 1,330 tokens
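The quick math above can be sketched as a one-line helper. This is an estimation sketch, not a tokenizer: the function name and the default 1.33 ratio come from this page.

```python
def words_to_tokens(words: int, ratio: float = 1.33) -> int:
    """Estimate LLM token count from an English-prose word count.

    The default ratio of ~1.33 tokens per word holds for English prose;
    pass a higher ratio for code or non-English text.
    """
    return round(words * ratio)

print(words_to_tokens(1_000))  # -> 1330
print(words_to_tokens(500))    # -> 665
```

For exact counts you still need the model's real tokenizer; this helper is only for back-of-envelope sizing.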

How the conversion works

English prose averages roughly 1.33 tokens per word across every major tokenizer (OpenAI o200k_base, cl100k_base, Claude's proprietary tokenizer, Google's SentencePiece). The ratio is surprisingly stable across providers.

Why 1.33? Tokenizers split common words into one token ("the", "and", "is") but split less common or longer words into two or three ("pneumonia" → "pn"/"eu"/"monia"). The long tail of rarer words averages out to ~4/3 tokens per word.

This ratio breaks down for code (2-3 tokens per "word"), JSON (similar), non-English text (often 2-3× higher), and for documents with many proper nouns, acronyms, or technical jargon.
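The content-type ratios above can be folded into a small lookup. The category names and the midpoint values below are assumptions for this sketch, chosen from the ranges stated on this page, not tokenizer-measured constants.

```python
# Illustrative tokens-per-word ratios drawn from the ranges on this page.
# "code" uses the midpoint of the 2-2.5 range quoted for JavaScript/Python;
# "european" uses the ~1.8 figure quoted for Spanish/French/German.
RATIOS = {
    "english_prose": 1.33,
    "code": 2.25,
    "european": 1.8,
}

def estimate_tokens(word_count: int, content_type: str = "english_prose") -> int:
    """Estimate tokens for a word count, using a per-content-type ratio."""
    return round(word_count * RATIOS[content_type])

print(estimate_tokens(1_000))          # -> 1330
print(estimate_tokens(1_000, "code"))  # -> 2250
```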

Common word counts

1 sentence (~15 words) ≈ 20 tokens
1 paragraph (~100 words) ≈ 133 tokens
1 page (~500 words) ≈ 665 tokens
1 blog post (~1,000 words) ≈ 1,330 tokens
1 short document (~3,000 words) ≈ 4,000 tokens
1 book chapter (~10,000 words) ≈ 13,300 tokens
1 novel (~80,000 words) ≈ 106,400 tokens

Frequently asked

Is the 1.33 ratio the same for every model?

Close enough for estimation. OpenAI's o200k_base is slightly more efficient than cl100k_base (smaller ratio), Claude's tokenizer is broadly similar to o200k, and Gemini's SentencePiece runs a hair higher. For back-of-envelope estimation, use 1.33; for precision, use our token counter tools.

Why is the ratio higher for code?

Code has lots of punctuation, indentation, and variable names that don't appear in the tokenizer's main vocabulary. "function" is usually one token, but "useState" or "__proto__" often splits into 2-3. Expect 2-2.5 tokens per "word" for typical JavaScript/Python code.

How many tokens are in a typical email?

A 200-word email is about 265 tokens. A longer support ticket or well-structured business email at 400 words is about 530 tokens. For costing, assume 300 tokens as a reasonable default for business email.

How should I count tokens for non-English text?

Most tokenizers were trained primarily on English, so non-English text runs higher. For Spanish/French/German expect ~1.8 tokens per word; for CJK (Chinese/Japanese/Korean), count characters instead of words — the ratio is roughly 1 character ≈ 1-2 tokens.
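The count-characters-for-CJK advice above can be sketched as a mixed-text estimator. The Unicode ranges and the 1.5 tokens-per-character midpoint (from the 1-2 range quoted above) are assumptions for illustration.

```python
def is_cjk(ch: str) -> bool:
    """Rough CJK check via common Unicode blocks (an approximation)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana + Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul syllables
    )

def estimate_tokens_mixed(text: str) -> int:
    """Count CJK characters at ~1.5 tokens each, other words at ~1.33."""
    cjk_chars = sum(1 for ch in text if is_cjk(ch))
    non_cjk_words = len("".join(ch for ch in text if not is_cjk(ch)).split())
    return round(cjk_chars * 1.5 + non_cjk_words * 1.33)

print(estimate_tokens_mixed("hello world"))  # -> 3
```

Real tokenizers handle mixed scripts far more subtly; treat this as a rough budgeting heuristic only.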

Does this count include the system prompt?

No — this converts words you paste to an approximate token count. In a real API call, your system prompt, conversation history, and any tool schemas all count toward input tokens. For production cost estimation, multiply this by the number of messages or use the Calcis /estimator tool.

Ready to estimate a real prompt?

Paste your actual text into the estimator for exact token counts and dollar costs across every model.