Anthropic
Claude Opus 4.8 pricing
Anthropic's May 2026 flagship. Same $5/$25 standard rate card as Opus 4.7, now with adaptive thinking and an optional Fast mode that runs ~2.5x faster for double the price.
Input: $5.00 / 1M tok
Output: $25.00 / 1M tok
- Context window: 1M
- Max output: 128K
- Cached input: -
- Verified: 2026-07-01
Pricing modes
| Tier | Input / 1M | Output / 1M | Cached in / 1M |
|---|---|---|---|
| Standard (default latency) | $5.00 | $25.00 | - |
| Fast mode (~2.5x throughput) | $10.00 | $50.00 | - |
Fast mode is a latency tier, not a different model. It bills at 2x the standard per-token rate in exchange for roughly 2.5x the tokens-per-second.
Claude Opus 4.8 is Anthropic's flagship model, released on 28 May 2026. It keeps the same headline rate card as Opus 4.7 – $5 per 1M input tokens and $25 per 1M output tokens – while adding adaptive thinking (the model decides how much to reason per turn rather than exposing a fixed effort dial) and support for mid-conversation system messages without a beta header.
The headline change for cost-sensitive teams is Fast mode. For latency-critical workloads, Opus 4.8 can run at roughly 2.5x the standard tokens-per-second, billed at $10 input / $50 output per 1M tokens. That is double the standard rate, so reach for it only when response speed genuinely matters; for batch and background work the standard tier is the right default.
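To make the Fast mode trade-off concrete, here is a minimal Python sketch that compares the two tiers for a single request. The prices and the 2.5x multiplier come from the table above (the multiplier is approximate); the `Tier` class, the function names, and especially `baseline_tps` are illustrative assumptions, since this page does not publish a Standard-tier tokens-per-second figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    input_per_mtok: float   # USD per 1M input tokens (from the pricing table)
    output_per_mtok: float  # USD per 1M output tokens (from the pricing table)
    speedup: float          # throughput relative to Standard; 2.5 is approximate


STANDARD = Tier("standard", 5.00, 25.00, 1.0)
FAST = Tier("fast", 10.00, 50.00, 2.5)


def request_cost(tier: Tier, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the given tier."""
    return (input_tokens * tier.input_per_mtok
            + output_tokens * tier.output_per_mtok) / 1_000_000


def generation_seconds(tier: Tier, output_tokens: int, baseline_tps: float) -> float:
    """Estimated time to generate the reply.

    baseline_tps is a hypothetical Standard-tier tokens-per-second value;
    measure your own traffic rather than trusting a default.
    """
    return output_tokens / (baseline_tps * tier.speedup)


def fast_mode_tradeoff(input_tokens: int, output_tokens: int,
                       baseline_tps: float) -> tuple[float, float]:
    """Return (extra USD paid, seconds saved) when switching to Fast mode."""
    extra_cost = (request_cost(FAST, input_tokens, output_tokens)
                  - request_cost(STANDARD, input_tokens, output_tokens))
    seconds_saved = (generation_seconds(STANDARD, output_tokens, baseline_tps)
                     - generation_seconds(FAST, output_tokens, baseline_tps))
    return extra_cost, seconds_saved


if __name__ == "__main__":
    extra, saved = fast_mode_tradeoff(1_000, 500, baseline_tps=50.0)
    print(f"extra cost: ${extra:.4f}, time saved: {saved:.1f}s")
```

With an assumed baseline of 50 tokens per second, a 1,000-token prompt with a 500-token reply costs about $0.0175 more in Fast mode and finishes roughly 6 seconds sooner. Whether that is worth paying depends entirely on your real throughput and on how much the wait costs your users.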
Context is 1M tokens with up to 128K output, unchanged from Opus 4.7. If you are moving up from 4.7, expect similar per-token economics with a modest quality lift; the main decision is whether any of your traffic justifies Fast mode.
Estimate LLM costs before you send
Paste your prompt into the Calcis estimator to see token counts and per-request cost across every tracked model, then compare Claude Opus 4.8 against them side by side.
Frequently asked
- How much does Claude Opus 4.8 cost per request?
- At the standard rate a 1,000-token prompt with a typical 500-token reply runs about $0.0175 ($0.005 input + $0.0125 output). Fast mode doubles that to about $0.035 for the same request.
- Is Opus 4.8 more expensive than Opus 4.7?
- No. The standard rate card is identical at $5/$25 per 1M tokens. The only price change is the new optional Fast mode at $10/$50, which you opt into per request.
- What is Fast mode?
- Fast mode runs Opus 4.8 at roughly 2.5x the standard throughput for latency-sensitive workloads. It bills at 2x the standard per-token rate ($10 input / $50 output per 1M).
- What is the context window for Claude Opus 4.8?
- 1M tokens by default, with up to 128K output tokens per response - the same envelope as Opus 4.7.
- Does Opus 4.8 still use effort levels?
- Not as a fixed dial. Opus 4.8 moves to adaptive thinking: the model allocates reasoning per turn rather than exposing a low/medium/high setting. You can still cap output length to bound cost.
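The answers above mention capping output length to bound cost. One way to reason about that, sketched here in Python under the standard $5/$25 rate card, is to compute the worst-case cost for a given output cap, or to work backwards from a per-request budget to the largest cap it can afford. The function names and the budget-based approach are illustrative assumptions, not part of any provider API. `Decimal` is used so the dollar arithmetic stays exact.

```python
from decimal import Decimal

INPUT_PER_MTOK = Decimal("5.00")    # USD per 1M input tokens, standard tier
OUTPUT_PER_MTOK = Decimal("25.00")  # USD per 1M output tokens, standard tier
MAX_OUTPUT_TOKENS = 128_000         # max output per response listed on this page
MTOK = Decimal(1_000_000)


def worst_case_cost(input_tokens: int, output_cap: int) -> Decimal:
    """Upper bound on request cost if the reply uses the whole output cap."""
    return (input_tokens * INPUT_PER_MTOK + output_cap * OUTPUT_PER_MTOK) / MTOK


def output_cap_for_budget(input_tokens: int, budget_usd: Decimal) -> int:
    """Largest output cap whose worst-case cost stays within budget_usd.

    Returns 0 when the prompt alone already meets or exceeds the budget.
    """
    remaining = budget_usd - input_tokens * INPUT_PER_MTOK / MTOK
    if remaining <= 0:
        return 0
    affordable = int(remaining * MTOK / OUTPUT_PER_MTOK)  # round down
    return min(affordable, MAX_OUTPUT_TOKENS)


if __name__ == "__main__":
    print(worst_case_cost(1_000, 500))                    # the FAQ example request
    print(output_cap_for_budget(1_000, Decimal("0.05")))  # cap for a 5-cent budget
```

For the 1,000-token prompt in the first question, a 500-token cap gives the same $0.0175 figure, and a 5-cent budget allows a cap of 1,800 output tokens. Adaptive thinking means actual usage per turn varies, so treat the cap as a ceiling rather than a prediction.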
Pricing verified 2026-07-01 from the provider's rate card. These figures are informational and not yet wired into the Calcis estimator or billing.