Methodology

How we estimate consumer-tier session quotas

The subscription-quota panel on /session predicts how much of each consumer chat plan a simulated session would consume. Numbers come from the formula below, calibrated against primary-source provider documentation and community measurement. This page is the audit trail.

Last verified 25 April 2026.

Why this is hard

None of the three major providers publish a clean “X messages per month” number. Anthropic states explicitly that its limits are tied to compute usage rather than raw message volume and that the number you can send varies with message length, attached files, conversation history, tool usage, model choice, and artefacts. OpenAI's help centre uses the phrase “may vary based on system conditions” and ships multiple stacked counters across 3-hour, weekly, and monthly windows. Google publishes the cleanest integer caps for Gemini but explicitly reserves the right to flex them under capacity pressure.

So Calcis cannot give you a single guaranteed number. What we can do is give you a defensible band that combines published policy, community measurement, and time-of-day throttling effects.

The formula

For each (plan, model) combination we identify the applicable counters — for example Anthropic Pro has a 5-hour rolling token bucket, a 7-day Sonnet/Haiku message cap, and a separate weekly Opus cap. For each counter we compute a low / typical / high share of the bucket the simulated session consumes:

base_usage      = (turns or estimated session tokens) × model_multiplier
peak            = {1.0, 1.0, 1.0} off-peak, or the provider's peak-multiplier band
share_low       = base_usage × peak.low     / bucket.high
share_typical   = base_usage × peak.typical / bucket.typical
share_high      = base_usage × peak.high    / bucket.low

dominant_counter = argmax(share_typical) across applicable counters

For token-based counters (Anthropic's 5-hour session) we model conversation-history compounding: turn N's input cost ≈ N × per-message input tokens, because the entire conversation is re-sent on every turn. This is why Pro's 5-hour session bucket exhausts after ~20 substantive turns rather than 200 short ones — empirically validated against an IntuitionLabs measurement showing that, by message 206, a user's actual prompt was just 1.3% of the ~118,000 tokens being processed.
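The compounding model reduces to triangular growth: after T turns of m input tokens each, cumulative input is m × T(T+1)/2. A minimal sketch (the ~200-token message size is an assumption for illustration):

```typescript
// Each turn re-sends the whole conversation, so turn N costs roughly
// N × tokensPerMessage of input. Cumulative cost is therefore triangular:
// tokensPerMessage × turns × (turns + 1) / 2.
function cumulativeInputTokens(
  turns: number,
  tokensPerMessage: number
): number {
  let total = 0;
  for (let n = 1; n <= turns; n++) {
    total += n * tokensPerMessage; // turn n re-sends n messages of history
  }
  return total;
}
```

With ~200-token messages, `cumulativeInputTokens(20, 200)` is 42,000 input tokens, already on the order of Pro's ~44k typical 5-hour bucket after just 20 turns.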

Model multipliers reflect that Anthropic Opus drains roughly 5× faster than Sonnet per equivalent prompt, and ChatGPT's mini and nano variants drain a fraction of an Instant message slot.

Bucket sizes by plan

The numbers below are what we encode in lib/subscription-quotas.ts. Each carries a primary-source URL and a verified-at date; the canonical machine-readable list lives in the source file.
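As a rough illustration, one encoded counter could look like the sketch below. The field names here are assumptions for this page; the canonical shape is whatever `lib/subscription-quotas.ts` actually defines:

```typescript
// Hypothetical shape of one encoded counter entry. The real definitions
// live in lib/subscription-quotas.ts; these names are illustrative.
interface QuotaCounter {
  plan: string;             // e.g. "claude-pro"
  counter: string;          // e.g. "5h-session"
  unit: "tokens" | "msgs";
  low: number;              // pessimistic bucket size
  typical: number;          // best community estimate
  high: number;             // optimistic bucket size
  source: string;           // primary-source URL
  verifiedAt: string;       // ISO date of last manual check
}

const example: QuotaCounter = {
  plan: "claude-pro",
  counter: "5h-session",
  unit: "tokens",
  low: 35_000,
  typical: 44_000,
  high: 55_000,
  source: "(primary-source URL)", // placeholder, not a real link
  verifiedAt: "2026-04-25",
};
```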

Claude

Claude Free — Free

  Counter                          Low       Typical   High      Unit     Source
  5-hour session                   8,000     12,000    20,000    tokens   link
    Rolling 5 hours from your first message.

Claude Pro — $20/mo

  Counter                          Low       Typical   High      Unit     Source
  5-hour session                   35,000    44,000    55,000    tokens   link
    Rolling 5 hours from your first message.
  Weekly (Sonnet/Haiku)            225       600       1,100     msgs     link
    Rolling 7 days from your first usage in the cycle.
  Weekly Opus                      20        50        100       msgs     link
    Pro Opus access is heavily limited.

Claude Max 5× — $100/mo

  Counter                          Low       Typical   High      Unit     Source
  5-hour session                   70,000    88,000    110,000   tokens   link
    Rolling 5 hours from your first message.
  Weekly (Sonnet/Haiku)            1,100     3,000     5,000     msgs     link
    Rolling 7 days from your first usage in the cycle.
  Weekly Opus                      200       500       900       msgs     link
    Separate 7-day Opus sub-cap (visible in Settings → Usage).

Claude Max 20× — $200/mo

  Counter                          Low       Typical   High      Unit     Source
  5-hour session                   180,000   220,000   280,000   tokens   link
    Rolling 5 hours from your first message.
  Weekly (Sonnet/Haiku)            5,000     12,000    20,000    msgs     link
    Rolling 7 days from your first usage in the cycle.
  Weekly Opus                      600       1,500     2,500     msgs     link
    Separate 7-day Opus sub-cap.

ChatGPT

ChatGPT Free — Free

  Counter                          Low       Typical   High      Unit     Source
  3-hour Instant                   25        35        50        msgs     link
    Rolling 3 hours; after exhaustion, auto-routes to nano.
  5-hour Thinking                  6         10        15        msgs     link
    Rolling 5 hours when Thinking is manually selected.

ChatGPT Plus — $20/mo

  Counter                          Low       Typical   High      Unit     Source
  3-hour Instant                   80        160       240       msgs     link
    Rolling 3 hours for GPT-5 Instant + GPT-4 family.
  Weekly Thinking                  2,000     3,000     4,000     msgs     link
    3,000 manually-selected Thinking msgs/week. Auto-routed Thinking is exempt.
  Weekly reasoning (o3 / o4-mini)  30        50        100       msgs     link
    o3 and o4-mini share a separate weekly cap.

ChatGPT Pro — $200/mo

  Counter                          Low       Typical   High      Unit     Source
  3-hour Instant                   800       1,500     3,000     msgs     link
    OpenAI describes Pro as near-unlimited; modeled as a large but finite bucket so percentages stay meaningful.
  Weekly Thinking                  2,000     3,000     4,000     msgs     link
    GPT-5 Thinking weekly cap remains 3,000 even on Pro.
  Weekly reasoning (o3 / o4-mini)  200       250       400       msgs     link
    Pro retains a higher reasoning cap than Plus.

Gemini

Gemini Free — Free

  Counter                          Low       Typical   High      Unit     Source
  Daily Pro                        3         5         8         msgs     link
    Resets at midnight Pacific; 5 Pro prompts/day.
  Daily Flash                      50        100       200       msgs     link
    Resets at midnight Pacific.

Google AI Pro — $20/mo

  Counter                          Low       Typical   High      Unit     Source
  Daily Pro                        60        100       150       msgs     link
    100 Pro prompts/day. Resets at midnight Pacific.
  Daily Flash                      500       1,000     2,000     msgs     link
    Soft daily Flash cap. Heavily distributed across the day.

Google AI Ultra — $250/mo

  Counter                          Low       Typical   High      Unit     Source
  Daily Pro                        300       500       800       msgs     link
    500 Pro prompts/day. Resets at midnight Pacific.
  Daily Flash                      2,000     5,000     10,000    msgs     link
    Highest published Flash allowance.

Peak-hour windows

Anthropic confirmed in March 2026 that during weekday US business hours (05:00–11:00 PT, ≈ 13:00–19:00 UTC) Claude users move through their 5-hour session limits faster than off-peak — about 7% of users hit limits they previously would not. Community estimates put the multiplier at roughly 2×. The effect is on the 5-hour session counter only; weekly caps are unaffected.

OpenAI's File Uploads FAQ states caps may be lowered during peak hours without specifying a window. We model US business hours (13:00–22:00 UTC weekdays) at a softer 1.3× multiplier since the public confirmation is weaker.

Google publishes no peak window. The Gemini help text only notes that Free users may be throttled before paid users when capacity is constrained.

For Australian and APAC users specifically: Anthropic peak (13:00–19:00 UTC) maps to 00:00–06:00 AEDT or 23:00–05:00 AEST. The Australian working day is naturally off-peak — you should see ~2× more headroom on Claude's 5-hour session counter than US-based users see during their working hours.
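The two modelled windows reduce to a simple UTC check. A sketch, where the window bounds and the 2.0× / 1.3× figures are the community estimates from this page, not provider-published constants:

```typescript
type Provider = "anthropic" | "openai" | "google";

// Returns the session-counter multiplier we model for a given UTC instant.
function peakMultiplier(provider: Provider, at: Date): number {
  const day = at.getUTCDay();   // 0 = Sunday … 6 = Saturday
  const hour = at.getUTCHours();
  const weekday = day >= 1 && day <= 5;
  if (provider === "anthropic" && weekday && hour >= 13 && hour < 19) {
    return 2.0; // community-estimated weekday US business-hours effect
  }
  if (provider === "openai" && weekday && hour >= 13 && hour < 22) {
    return 1.3; // softer multiplier; confirmation is weaker
  }
  return 1.0;   // off-peak, weekends, and Google (no published window)
}
```

Weekends and all Gemini traffic fall through to 1.0, which is why an Australian daytime session never picks up the Anthropic peak multiplier.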

What we don't know

The honest list of limitations.

  • Tier bucket sizes for Claude. Community P90 estimates (~44k / 88k / 220k tokens per 5-hour window for Pro / Max 5× / Max 20×) come from Claude-Code-Usage-Monitor and have not been confirmed by Anthropic.
  • Peak-hour magnitude. Anthropic communicates the effect qualitatively. The 2× multiplier is a community estimate; the true figure may be lower or higher and may shift with policy changes.
  • Silent policy changes. GitHub issue #9094 documents an unannounced limit reduction in late September 2025 affecting roughly 30 reporting users. Providers re-tune limits without notice; we re-verify buckets against help-centre pages but there is always lag.
  • Hidden context. System prompts, project knowledge, memory features, automatic compaction and CLAUDE.md/instruction files all add tokens the user never sees. The simulator passes turn count only; real sessions with attached PDFs or large project context can land outside the band.
  • Tokeniser drift. Simon Willison measured a 1.46× token inflation on the Opus 4.7 system prompt vs Opus 4.6 with no API-pricing change, so the same prompt costs more on a newer model version.
  • Tool-call recursion. A single user prompt in agentic mode can spawn dozens of internal tool calls; the simulator does not yet model this and the band will under-estimate for agentic workloads.

When in doubt, treat the typical share as a midpoint and the high share as a worst-case for capacity planning. The band exists because the underlying numbers are uncertain — please do not treat any single percentage as a guarantee.

Update cadence

Bucket sizes and peak windows are re-checked manually against provider help centres on a rolling cadence. Each counter carries its own verifiedAt date. When a provider announces a policy change, we update the relevant counter and bump its date. Material changes land in the release notes.

Spot a stale source or a bucket size you can verify against your own usage data? Tell us — methodology improvements ship in the open.