Question 1

How much does Claude Opus 4.8 cost per request?

Accepted Answer

It depends on token counts. At the standard rate ($5 input / $25 output per 1M tokens), a 1,000-token prompt with a typical 500-token reply costs about $0.0175 ($0.005 input + $0.0125 output). Fast mode doubles that to about $0.035 for the same request.
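The arithmetic above can be reproduced with a short script. This is a minimal sketch, not an official calculator: the rates are the ones quoted in this FAQ, while the names `RATES` and `estimate_cost` are made up for illustration. It uses `Decimal` so the small dollar amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Per-1M-token rates in USD, copied from the rate card quoted in this FAQ.
# The tier keys "standard" and "fast" are illustrative labels, not API values.
RATES = {
    "standard": {"input": Decimal("5"), "output": Decimal("25")},
    "fast": {"input": Decimal("10"), "output": Decimal("50")},
}
PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, tier: str = "standard") -> Decimal:
    """Return the estimated USD cost of one request at the given tier."""
    rate = RATES[tier]
    input_cost = rate["input"] * input_tokens / PER_MILLION
    output_cost = rate["output"] * output_tokens / PER_MILLION
    return input_cost + output_cost


print(estimate_cost(1_000, 500))          # 0.0175
print(estimate_cost(1_000, 500, "fast"))  # 0.035
```

Swap in your own prompt and reply sizes to estimate a request before sending it.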

Question 2

Is Opus 4.8 more expensive than Opus 4.7?

Accepted Answer

No. The standard rate card is identical at $5 input / $25 output per 1M tokens. The only price change is the new optional Fast mode at $10 input / $50 output per 1M tokens, which you opt into per request.

Question 3

What is Fast mode?

Accepted Answer

Fast mode runs Opus 4.8 at roughly 2.5x the standard throughput for latency-sensitive workloads. It bills at 2x the standard per-token rate ($10 input / $50 output per 1M).

Question 4

What is the context window for Claude Opus 4.8?

Accepted Answer

1M tokens by default, with up to 128K output tokens per response. This is the same envelope as Opus 4.7.
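To get a feel for what a request at the edge of that envelope would cost, here is a rough upper-bound sketch. It assumes "1M" and "128K" mean 1,000,000 and 128,000 tokens, and it prices a full-window prompt plus a full-length reply at the rates quoted in this FAQ. If output tokens share the context window, the real maximum is lower, so treat the result as a ceiling. The function name is hypothetical.

```python
from decimal import Decimal

# Assumption: "1M" and "128K" are read as round decimal numbers.
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 128_000

# Standard per-1M-token rates in USD, from this FAQ's rate card.
INPUT_RATE = Decimal("5")
OUTPUT_RATE = Decimal("25")


def max_request_cost(price_multiplier: int = 1) -> Decimal:
    """Ceiling on one request's cost; pass 2 for Fast mode (billed at 2x)."""
    input_cost = INPUT_RATE * MAX_INPUT_TOKENS / 1_000_000
    output_cost = OUTPUT_RATE * MAX_OUTPUT_TOKENS / 1_000_000
    return (input_cost + output_cost) * price_multiplier


print(max_request_cost())   # standard-rate ceiling in USD
print(max_request_cost(2))  # Fast mode ceiling in USD
```

Under these assumptions the ceiling works out to $5.00 of input plus $3.20 of output at the standard rate.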

Question 5

Does Opus 4.8 still use effort levels?

Accepted Answer

Not as a fixed dial. Opus 4.8 moves to adaptive thinking: the model allocates reasoning per turn rather than exposing a fixed low/medium/high setting. You can still cap output length to bound cost.
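One way to pick an output cap is to work backwards from a per-request budget. The sketch below is illustrative only: the function name is hypothetical, the default rates are the standard ones quoted in this FAQ, and it assumes that any reasoning tokens are billed as output and count toward the cap.

```python
from decimal import Decimal, ROUND_FLOOR


def max_output_tokens_for_budget(
    budget_usd: Decimal,
    input_tokens: int,
    input_rate: Decimal = Decimal("5"),    # USD per 1M input tokens (standard)
    output_rate: Decimal = Decimal("25"),  # USD per 1M output tokens (standard)
) -> int:
    """Largest output-token cap that keeps one request within budget_usd."""
    input_cost = input_rate * input_tokens / 1_000_000
    remaining = budget_usd - input_cost
    if remaining <= 0:
        return 0  # the prompt alone already uses up the budget
    tokens = remaining * 1_000_000 / output_rate
    # Round down so the cap never pushes the request over budget.
    return int(tokens.to_integral_value(rounding=ROUND_FLOOR))


# A $0.02 budget with a 1,000-token prompt at the standard rate.
print(max_output_tokens_for_budget(Decimal("0.02"), 1_000))  # 600
```

For Fast mode, pass `Decimal("10")` and `Decimal("50")` as the rates; the same budget then allows a smaller cap.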

Tier	Input / 1M tokens	Output / 1M tokens	Cached input / 1M tokens
Standard (default latency)	$5.00	$25.00	-
Fast mode (~2.5x throughput)	$10.00	$50.00	-
