
Claude Sonnet 4.6 pricing

Anthropic's mid-tier model. The default for production code, agentic work, and anywhere you need long-context reasoning without Opus pricing.

Input: $3.00 / 1M tokens
Output: $15.00 / 1M tokens
Context window: 1M tokens
Max output: 64K tokens
Cached input: -
Verified: 2026-04-06

Sonnet 4.6 is the model most teams reach for when they graduate past Haiku but can't justify Opus on every request. At $3 input / $15 output per 1M tokens, it's 5x cheaper than Opus 4.6 while delivering frontier-class performance on reasoning, code generation, and tool use. The 1M-token context window matches Opus, so you don't give up document-scale work to save money.

Sonnet is the workhorse behind many agentic workflows - Claude Code, Cursor, and most enterprise assistant deployments default to it. The 64K max output covers any realistic single response, and the per-token rates keep multi-turn conversations affordable at scale. Cache writes cost extra ($3.75 per 1M tokens on the write side; reads are 90% off the standard input rate), so design around a stable system prompt to maximize the discount.
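The caching math above can be sketched as a quick blended-cost calculation. This is a minimal illustration using the rates quoted on this page ($3.00/1M standard input, $3.75/1M cache writes, reads at 90% off); the function name and the sample call pattern are ours, not part of any Anthropic API.

```python
# Rates from this page, expressed as dollars per input token.
STANDARD = 3.00 / 1_000_000
CACHE_WRITE = 3.75 / 1_000_000
CACHE_READ = 0.30 / 1_000_000  # 90% off the standard input rate

def input_cost(prefix_tokens: int, suffix_tokens: int, calls: int) -> float:
    """Input-side cost of `calls` requests sharing a cached prefix.

    The first call writes the prefix to cache; subsequent calls read it.
    The per-call variable suffix always bills at the standard rate.
    """
    first = prefix_tokens * CACHE_WRITE + suffix_tokens * STANDARD
    rest = (calls - 1) * (prefix_tokens * CACHE_READ + suffix_tokens * STANDARD)
    return first + rest

# A 4,000-token system prompt reused across 100 calls, each adding a
# 500-token user turn:
with_cache = input_cost(4_000, 500, 100)
without_cache = 100 * (4_000 + 500) * STANDARD  # same traffic, no caching
```

Even after paying the slightly higher write rate once, a reused prefix cuts the input bill substantially; the longer and more stable the prefix, the bigger the saving.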

Calcis counts Sonnet input tokens via Anthropic's free countTokens API (rate-limited but exact) and predicts output length using a regression trained on real (prompt, response) pairs. Pre-flight your spend before shipping a feature to production.
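As an illustration of the output-prediction half of that approach, here is a minimal least-squares fit of response length on prompt length. The sample data and the single-feature model are assumptions for demonstration, not Calcis's actual training set or feature set.

```python
def fit_line(pairs):
    """Ordinary least squares for y = a*x + b over (x, y) pairs."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Hypothetical (prompt_tokens, response_tokens) history:
history = [(200, 150), (1_000, 480), (2_000, 900), (4_000, 1_700)]
slope, intercept = fit_line(history)

# Predicted output length for a 1,500-token prompt:
predicted = slope * 1_500 + intercept
```

A real estimator would use more features (task type, temperature, stop conditions), but the shape of the idea is the same: learn output length from observed traffic, then multiply by the output rate.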

Estimate your cost on Claude Sonnet 4.6

Paste your prompt into the estimator, pick Claude Sonnet 4.6, and see the exact dollar cost - input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.

Frequently asked

How much does Claude Sonnet 4.6 cost per call?
A 1,000-token prompt with a typical 500-token response costs about $0.0105 ($0.003 input + $0.0075 output). For a more accurate estimate on your prompt, run it through Calcis.
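The arithmetic behind that figure is a one-liner; this sketch just parameterizes it with the $3.00 input / $15.00 output per-1M rates from this page (the function itself is ours, not a Calcis or Anthropic API).

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_rate: float = 3.00, output_rate: float = 15.00) -> float:
    """Dollar cost of one call; rates are dollars per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# 1,000-token prompt with a 500-token response:
cost = call_cost(1_000, 500)  # 0.003 input + 0.0075 output = 0.0105
```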
Is Sonnet 4.6 worth 5x the price of Haiku 4.5?
For routine chat, summarization, and routing tasks - usually no. For agentic workflows, code generation, and anything that benefits from chain-of-thought - usually yes. Sonnet's reasoning quality scales noticeably better on hard prompts.
What is prompt caching on Claude?
Cached input reads cost 10% of the standard input rate. The first call writes the cache (priced higher than standard input), then subsequent calls within 5 minutes that share the same prefix read from cache. Best for stable system prompts and few-shot examples.
Does Sonnet 4.6 support a 1M-token context window?
Yes - the full 1M tokens is available on the standard API tier, with a 64K maximum output per response. Long-context inputs bill at the standard rate; unlike Gemini, there is no long-context surcharge tier.

Pricing verified 2026-04-06 from the provider's rate card.