Methodology
How we estimate consumer-tier session quotas
The subscription-quota panel on /session predicts how much of each consumer chat plan a simulated session would consume. Numbers come from the formula below, calibrated against primary-source provider documentation and community measurement. This page is the audit trail.
Last verified 25 April 2026.
Why this is hard
None of the three major providers publish a clean “X messages per month” number. Anthropic states explicitly that its limits are tied to compute usage rather than raw message volume and that the number you can send varies with message length, attached files, conversation history, tool usage, model choice, and artefacts. OpenAI's help centre uses the phrase “may vary based on system conditions” and ships multiple stacked counters across 3-hour, weekly, and monthly windows. Google publishes the cleanest integer caps for Gemini but explicitly reserves the right to flex them under capacity pressure.
So Calcis cannot give you a single guaranteed number. What we can do is give you a defensible band that combines published policy, community measurement, and time-of-day throttling effects.
The formula
For each (plan, model) combination we identify the applicable counters — for example Anthropic Pro has a 5-hour rolling token bucket, a 7-day Sonnet/Haiku message cap, and a separate weekly Opus cap. For each counter we compute a low / typical / high share of the bucket the simulated session consumes:
```
base_usage       = (turns or estimated session tokens) × model_multiplier
peak_multiplier  = 1.0 (off-peak) or provider-specific (peak)
share_low        = base_usage × peak.low     / bucket.high
share_typical    = base_usage × peak.typical / bucket.typical
share_high       = base_usage × peak.high    / bucket.low
dominant_counter = argmax(share_typical) across applicable counters
```
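The formula can be sketched in TypeScript (names and the multiplier values here are illustrative; the real implementation lives in lib/subscription-quotas.ts):

```typescript
// A low/typical/high band, used both for bucket sizes and for computed shares.
interface Band { low: number; typical: number; high: number; }

interface Counter { name: string; bucket: Band; }

// Hypothetical model multipliers reflecting the prose: Opus drains ~5× faster
// than Sonnet; lightweight variants drain a fraction of a slot.
const MODEL_MULTIPLIER: Record<string, number> = { opus: 5, sonnet: 1, haiku: 0.25 };

// Share of one counter's bucket a session consumes. Note the deliberate
// cross-pairing: the pessimistic share divides peak-inflated usage by the
// smallest plausible bucket, and the optimistic share does the reverse.
function share(baseUsage: number, peak: Band, bucket: Band): Band {
  return {
    low: (baseUsage * peak.low) / bucket.high,
    typical: (baseUsage * peak.typical) / bucket.typical,
    high: (baseUsage * peak.high) / bucket.low,
  };
}

// The counter surfaced to the user is the one with the largest typical share.
function dominantCounter(rows: { counter: Counter; share: Band }[]) {
  return rows.reduce((a, b) => (b.share.typical > a.share.typical ? b : a));
}
```

Off-peak, the peak band is simply `{ low: 1, typical: 1, high: 1 }`, so the band width comes entirely from bucket uncertainty.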
For token-based counters (Anthropic 5-hour session) we model conversation-history compounding: turn N input cost ≈ N × per-message input tokens, because the entire conversation re-sends each turn. This is why Pro's 5-hour session bucket exhausts after ~20 substantive turns rather than 200 short ones; empirically, an IntuitionLabs measurement found that by message 206 a user's actual prompt was just 1.3% of the ~118,000 tokens being processed.
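The compounding follows directly from the re-send: if each turn adds roughly the same number of tokens, cumulative input cost is a triangular sum and grows quadratically with turn count (the per-turn figure below is illustrative, not measured):

```typescript
// Turn n re-sends the whole conversation, so its input cost is ~n × perTurnTokens.
// Total input cost after T turns is the triangular sum perTurnTokens × T(T+1)/2.
function cumulativeInputTokens(turns: number, perTurnTokens: number): number {
  return (perTurnTokens * turns * (turns + 1)) / 2;
}

// ~200 tokens per turn already exhausts a 44,000-token bucket near turn 20:
cumulativeInputTokens(20, 200); // 42,000 tokens
// By contrast, 200 short messages with no history re-send would cost only
// 200 × 200 = 40,000 tokens in total — the quadratic growth is the whole story.
```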
Model multipliers reflect that Anthropic Opus drains roughly 5× faster than Sonnet per equivalent prompt, and ChatGPT's mini and nano variants drain a fraction of an Instant message slot.
Bucket sizes by plan
The numbers below are what we encode in lib/subscription-quotas.ts. Each carries a primary-source URL and a verified-at date; the canonical machine-readable list lives in the source file.
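The shape of each encoded counter is roughly as follows (field names are illustrative, and the URL is a placeholder; the authoritative definitions and source links live in lib/subscription-quotas.ts):

```typescript
// One counter as encoded per (plan, model): a low/typical/high bucket band,
// its unit, and the provenance this methodology page requires.
interface QuotaCounter {
  name: string;                  // e.g. "5-hour session"
  unit: "tokens" | "msgs";
  bucket: { low: number; typical: number; high: number };
  sourceUrl: string;             // primary-source help-centre URL
  verifiedAt: string;            // ISO date the source was last re-checked
}

const example: QuotaCounter = {
  name: "5-hour session",
  unit: "tokens",
  bucket: { low: 35_000, typical: 44_000, high: 55_000 },
  sourceUrl: "https://example.com/placeholder", // placeholder, not the real link
  verifiedAt: "2026-04-25",
};
```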
Claude
Free

| Counter | Low | Typical | High | Unit | Source |
|---|---|---|---|---|---|
| **5-hour session.** Rolling 5 hours from your first message. | 8,000 | 12,000 | 20,000 | tokens | link |
Pro

| Counter | Low | Typical | High | Unit | Source |
|---|---|---|---|---|---|
| **5-hour session.** Rolling 5 hours from your first message. | 35,000 | 44,000 | 55,000 | tokens | link |
| **Weekly (Sonnet/Haiku).** Rolling 7 days from your first usage in the cycle. | 225 | 600 | 1,100 | msgs | link |
| **Weekly Opus.** Pro Opus access is heavily limited. | 20 | 50 | 100 | msgs | link |
Max 5×

| Counter | Low | Typical | High | Unit | Source |
|---|---|---|---|---|---|
| **5-hour session.** Rolling 5 hours from your first message. | 70,000 | 88,000 | 110,000 | tokens | link |
| **Weekly (Sonnet/Haiku).** Rolling 7 days from your first usage in the cycle. | 1,100 | 3,000 | 5,000 | msgs | link |
| **Weekly Opus.** Separate 7-day Opus sub-cap (visible in Settings → Usage). | 200 | 500 | 900 | msgs | link |
ChatGPT
Free

| Counter | Low | Typical | High | Unit | Source |
|---|---|---|---|---|---|
| **3-hour Instant.** Rolling 3 hours; after exhaustion, auto-routes to nano. | 25 | 35 | 50 | msgs | link |
| **5-hour Thinking.** Rolling 5 hours when Thinking is manually selected. | 6 | 10 | 15 | msgs | link |
Plus

| Counter | Low | Typical | High | Unit | Source |
|---|---|---|---|---|---|
| **3-hour Instant.** Rolling 3 hours for GPT-5 Instant + GPT-4 family. | 80 | 160 | 240 | msgs | link |
| **Weekly Thinking.** 3,000 manually-selected Thinking msgs/week; auto-routed Thinking is exempt. | 2,000 | 3,000 | 4,000 | msgs | link |
| **Weekly reasoning (o3 / o4-mini).** o3 and o4-mini share a separate weekly cap. | 30 | 50 | 100 | msgs | link |
Pro

| Counter | Low | Typical | High | Unit | Source |
|---|---|---|---|---|---|
| **3-hour Instant.** OpenAI describes Pro as near-unlimited; modeled as a large but finite bucket so percentages stay meaningful. | 800 | 1,500 | 3,000 | msgs | link |
| **Weekly Thinking.** GPT-5 Thinking weekly cap remains 3,000 even on Pro. | 2,000 | 3,000 | 4,000 | msgs | link |
| **Weekly reasoning (o3 / o4-mini).** Pro retains a higher reasoning cap than Plus. | 200 | 250 | 400 | msgs | link |
Gemini
Free

| Counter | Low | Typical | High | Unit | Source |
|---|---|---|---|---|---|
| **Daily Pro.** Resets at midnight Pacific; 5 Pro prompts/day. | 3 | 5 | 8 | msgs | link |
| **Daily Flash.** Resets at midnight Pacific. | 50 | 100 | 200 | msgs | link |
Peak-hour windows
Anthropic confirmed in March 2026 that during weekday US business hours (05:00–11:00 PT, ≈ 13:00–19:00 UTC) Claude users move through their 5-hour session limits faster than off-peak — about 7% of users hit limits they previously would not. Community estimates put the multiplier at roughly 2×. The effect is on the 5-hour session counter only; weekly caps are unaffected.
OpenAI's File Uploads FAQ states caps may be lowered during peak hours without specifying a window. We model US business hours (13:00–22:00 UTC weekdays) at a softer 1.3× multiplier since the public confirmation is weaker.
Google publishes no peak window. The Gemini help text only notes that Free users may be throttled before paid users when capacity is constrained.
For Australian and APAC users specifically: Anthropic peak (13:00–19:00 UTC) maps to 00:00–06:00 AEDT or 23:00–05:00 AEST. The Australian working day is naturally off-peak, so you should see ~2× more headroom on Claude's 5-hour session counter than US-based users see during their working hours.
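A sketch of the window check, assuming the UTC windows above and the community-estimated multipliers (2× for Anthropic's session counter, a softer 1.3× for OpenAI, none published for Google):

```typescript
// Returns the peak multiplier applied to the session counter for a given
// provider at a given moment. Windows and multipliers are the estimates
// described above, not provider-published values.
function peakMultiplier(provider: "anthropic" | "openai" | "google", at: Date): number {
  const day = at.getUTCDay();            // 0 = Sunday … 6 = Saturday
  const hour = at.getUTCHours();
  const weekday = day >= 1 && day <= 5;
  if (provider === "anthropic" && weekday && hour >= 13 && hour < 19) return 2.0;
  if (provider === "openai" && weekday && hour >= 13 && hour < 22) return 1.3;
  return 1.0;                            // off-peak, or Google (no published window)
}

// A Sydney working morning (23:00 UTC the previous day) is off-peak everywhere:
peakMultiplier("anthropic", new Date(Date.UTC(2026, 3, 22, 23, 0))); // 1.0
```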
What we don't know
The honest list of limitations.
- Tier bucket sizes for Claude. Community P90 estimates (~44k / 88k / 220k tokens per 5-hour window for Pro / Max 5× / Max 20×) come from Claude-Code-Usage-Monitor and have not been confirmed by Anthropic.
- Peak-hour magnitude. Anthropic communicates the effect qualitatively. The 2× multiplier is a community estimate; the true figure may be lower or higher and may shift with policy changes.
- Silent policy changes. GitHub issue #9094 documents an unannounced limit reduction in late September 2025 affecting roughly 30 reporting users. Providers re-tune limits without notice; we re-verify buckets against help-centre pages but there is always lag.
- Hidden context. System prompts, project knowledge, memory features, automatic compaction and CLAUDE.md/instruction files all add tokens the user never sees. The simulator passes turn count only; real sessions with attached PDFs or large project context can land outside the band.
- Tokeniser drift. Simon Willison measured a 1.46× token inflation on the Opus 4.7 system prompt vs Opus 4.6 with no API-pricing change, so the same prompt costs more on a newer model version.
- Tool-call recursion. A single user prompt in agentic mode can spawn dozens of internal tool calls; the simulator does not yet model this and the band will under-estimate for agentic workloads.
When in doubt, treat the typical share as a midpoint and the high share as a worst-case for capacity planning. The band exists because the underlying numbers are uncertain — please do not treat any single percentage as a guarantee.
Update cadence
Bucket sizes and peak windows are re-checked manually against provider help centres on a rolling cadence. Each counter carries its own verifiedAt date. When a provider announces a policy change, we update the relevant counter and bump its date. Material changes land in the release notes.