Pricing changelog

Name: Calcis LLM Pricing Changelog
Creator: Calcis
License: https://opensource.org/licenses/MIT

Every pricing change, logged. Updated within hours of provider announcements.

April 18, 2026
Q2 2026 pricing report published
Full Q2 2026 landscape report with 25-model comparison, head-to-head flagship scenarios, and a machine-readable CSV distribution.
- Published the Q2 2026 LLM Pricing Report at /reports/llm-pricing-q2-2026
- Full 25-model landscape with head-to-head flagship scenarios and five workload cost calculations
- Machine-readable dataset available at /data/llm-pricing-q2-2026.csv (RFC 4180, columns match ModelPricing)
- Added Dataset and Report JSON-LD markup so search engines can surface the CSV distribution
Source
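A minimal sketch of reading such a CSV with Python's standard csv module. The column names (model, input_per_m, output_per_m) and the inline sample are assumptions for illustration only; the real header follows ModelPricing, whose fields are not listed in this changelog.

```python
import csv
import io

# Hypothetical sample in RFC 4180 style (CRLF line endings, comma-separated).
# Column names are assumed; prices are taken from entries in this changelog.
SAMPLE_CSV = (
    "model,input_per_m,output_per_m\r\n"
    "Claude Opus 4.7,5.00,25.00\r\n"
    "GPT-5.4,2.50,15.00\r\n"
)


def load_pricing(text):
    """Map model name -> (input USD per 1M tokens, output USD per 1M tokens)."""
    # newline="" leaves line endings untouched so the csv module can parse them.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return {
        row["model"]: (float(row["input_per_m"]), float(row["output_per_m"]))
        for row in reader
    }


pricing = load_pricing(SAMPLE_CSV)
print(pricing["GPT-5.4"])
```

For the published file, the same function would take the downloaded text, with the dictionary keys adjusted to the actual header.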
April 17, 2026
Claude Opus 4.7 added
Anthropic shipped Claude Opus 4.7 at $5/$25 per 1M tokens with a retrained tokenizer that emits 1.0–1.35× as many tokens for the same text.
- Added Claude Opus 4.7 ($5/$25 per 1M tokens, 1M context)
- Added a tokenizer multiplier (1.15×) for the new Opus 4.7 tokenizer
- Added xhigh effort level for Anthropic models
- Added effort level cost modelling for all Anthropic and OpenAI models
Source
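A rough sketch of how a tokenizer multiplier could feed into a cost estimate. It assumes the multiplier scales token counts measured with the older tokenizer and applies equally to input and output; the function name and the example token counts are hypothetical, not Calcis code.

```python
def request_cost_usd(input_tokens, output_tokens, input_per_m, output_per_m,
                     tokenizer_multiplier=1.0):
    """Estimate request cost in USD after scaling token counts by the multiplier."""
    scaled_in = input_tokens * tokenizer_multiplier
    scaled_out = output_tokens * tokenizer_multiplier
    return (scaled_in * input_per_m + scaled_out * output_per_m) / 1_000_000


# Opus 4.7 list price from the entry above: $5 input / $25 output per 1M tokens.
baseline = request_cost_usd(100_000, 10_000, 5.00, 25.00)
adjusted = request_cost_usd(100_000, 10_000, 5.00, 25.00, tokenizer_multiplier=1.15)
print(f"{baseline:.4f} -> {adjusted:.4f}")
```

At an unchanged list price, a 1.15× multiplier raises the estimated cost of the same text by 15%.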
April 16, 2026
Model expansion + pricing updates
Ingested the full 2025–2026 model catalogue: GPT-5 and GPT-5.4 families, o3/o4-mini, Gemini 3.1 previews, and the remaining current Anthropic tiers.
- Added GPT-5.4, GPT-5.4 mini, GPT-5.4 nano
- Added GPT-5, GPT-5 mini, GPT-5 nano
- Added GPT-4.1, GPT-4.1 mini
- Added o3, o4-mini
- Added Gemini 3.1 Pro, Gemini 3 Flash, Gemini 3.1 Flash-Lite (preview)
- Added Gemini 2.5 Flash-Lite
- Updated all Anthropic models to current pricing
Source
April 1, 2026
Initial pricing database
Calcis launched with an initial 24-model pricing database spanning OpenAI, Anthropic, and Google.
- Launched with 24 models across OpenAI, Anthropic, and Google
- Claude Opus 4.6, Sonnet 4.6, Haiku 4.5
- GPT-4o, GPT-4o mini
- Gemini 2.5 Pro, Gemini 2.5 Flash
Source
March 20, 2026 (Backfilled)
Claude Opus 4.6 released
Anthropic shipped Claude Opus 4.6 as the flagship tier at $5/$25 per 1M tokens with a 1M-token context window.
- Claude Opus 4.6 released at $5/$25 per 1M tokens
- 1M-token context window with 128K max output
Source
February 15, 2026 (Backfilled)
OpenAI GPT-5.4 family launch
OpenAI released GPT-5.4, GPT-5.4 mini, and GPT-5.4 nano with the same token window as GPT-5 but lower blended cost on typical workloads.
- GPT-5.4 shipped at $2.50/$15.00 per 1M tokens, 1.05M context
- GPT-5.4 mini at $0.75/$4.50 per 1M tokens
- GPT-5.4 nano at $0.20/$1.25 per 1M tokens
Source
February 4, 2026 (Backfilled)
Gemini 3 Flash (preview)
Google launched Gemini 3 Flash in preview at $0.50/$3.00 per 1M tokens with a 1M-token context.
- Gemini 3 Flash preview at $0.50/$3.00 per 1M tokens
- Cached input priced at $0.05 per 1M tokens
Source
January 22, 2026 (Backfilled)
Gemini 3.1 Pro (preview)
Google introduced Gemini 3.1 Pro in preview with tiered long-context pricing above the 200K-token threshold.
- Gemini 3.1 Pro preview at $2.00/$12.00 per 1M tokens
- Long-context tier (>200K tokens): $4.00/$18.00 per 1M
- Cached input at $0.20 per 1M tokens
Source
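A small sketch of the tiered rates above. It assumes the whole request is billed at the long-context rate once the prompt exceeds 200K tokens; that billing rule and the function name are assumptions, not something this changelog states.

```python
STANDARD_RATES = (2.00, 12.00)      # USD per 1M tokens: (input, output)
LONG_CONTEXT_RATES = (4.00, 18.00)  # applies above the threshold
LONG_CONTEXT_THRESHOLD = 200_000    # prompt tokens


def gemini_31_pro_cost(prompt_tokens, output_tokens):
    """Estimate USD cost, choosing the rate pair from the prompt size (assumed rule)."""
    if prompt_tokens > LONG_CONTEXT_THRESHOLD:
        in_rate, out_rate = LONG_CONTEXT_RATES
    else:
        in_rate, out_rate = STANDARD_RATES
    return (prompt_tokens * in_rate + output_tokens * out_rate) / 1_000_000


print(gemini_31_pro_cost(100_000, 1_000))  # standard tier
print(gemini_31_pro_cost(300_000, 1_000))  # long-context tier
```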
January 15, 2026 (Backfilled)
Gemini 3.1 Flash-Lite (preview)
Google shipped Gemini 3.1 Flash-Lite in preview as the lowest-tier model in the 3.x family at $0.25/$1.50 per 1M tokens.
- Gemini 3.1 Flash-Lite preview at $0.25/$1.50 per 1M tokens
- 1M-token context with aggressive cache pricing at $0.025/1M
Source
December 15, 2025 (Backfilled)
Claude Sonnet 4.6 released
Anthropic released Claude Sonnet 4.6 at $3/$15 per 1M tokens with a 1M-token context window for standard tier users.
- Sonnet 4.6 released at $3/$15 per 1M tokens
- Context window extended to 1M tokens
Source
November 24, 2025 (Backfilled)
Claude Opus 4.5 released
Anthropic released Claude Opus 4.5 at $5/$25 per 1M tokens, a steep cut from Opus 4.1's $15/$75 rate.
- Opus 4.5 at $5/$25 per 1M tokens (down from $15/$75 for 4.1)
- 200K-token context, 64K max output
Source
October 15, 2025 (Backfilled)
Claude Haiku 4.5 released
Anthropic refreshed the Haiku tier at $1/$5 per 1M tokens with a 200K context window.
- Haiku 4.5 at $1/$5 per 1M tokens
- 200K-token context, 64K max output
Source
September 29, 2025 (Backfilled)
Claude Sonnet 4.5 released
Anthropic released Claude Sonnet 4.5 at $3/$15 per 1M tokens.
- Sonnet 4.5 at $3/$15 per 1M tokens
- 200K-token context, 64K max output
Source
September 15, 2025 (Backfilled)
Gemini 2.5 Flash-Lite GA
Google graduated Gemini 2.5 Flash-Lite to general availability at $0.10/$0.40 per 1M tokens, the cheapest tier in the 2.5 line.
- Gemini 2.5 Flash-Lite GA at $0.10/$0.40 per 1M tokens
- Cached input at $0.01 per 1M tokens
Source
August 7, 2025 (Backfilled)
OpenAI GPT-5 family launch
OpenAI launched GPT-5, GPT-5 mini, and GPT-5 nano as the post-GPT-4.1 flagship line.
- GPT-5 at $1.25/$10.00 per 1M tokens, 400K context
- GPT-5 mini at $0.25/$2.00 per 1M tokens
- GPT-5 nano at $0.05/$0.40 per 1M tokens
- Automatic cached-input discount: cached input billed at one-tenth of the input rate (90% off)
Source
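A minimal sketch of the cached-input arithmetic for GPT-5, assuming cached tokens are billed at one-tenth of the $1.25 input rate and the rest at the full rate. The function and parameter names are hypothetical.

```python
def gpt5_input_cost(total_input_tokens, cached_tokens,
                    input_per_m=1.25, cache_divisor=10):
    """Estimate USD input cost when part of the prompt is served from cache."""
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    uncached_tokens = total_input_tokens - cached_tokens
    cached_rate = input_per_m / cache_divisor  # $0.125 per 1M at the default rate
    return (uncached_tokens * input_per_m + cached_tokens * cached_rate) / 1_000_000


# 1M input tokens, 80% of them cached.
print(gpt5_input_cost(1_000_000, 800_000))
```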
August 5, 2025 (Backfilled)
Claude Opus 4.1 released
Anthropic released Claude Opus 4.1 at $15/$75 per 1M tokens, the last entry in the 4.x top-tier pricing before the 4.5 reset.
- Opus 4.1 at $15/$75 per 1M tokens
- 200K context, 32K max output
Source
June 17, 2025 (Backfilled)
Gemini 2.5 Flash GA pricing
Google finalized Gemini 2.5 Flash GA pricing at $0.30/$2.50 per 1M tokens with a 1M-token context.
- Gemini 2.5 Flash GA at $0.30/$2.50 per 1M tokens
- 1M-token context window
Source
May 22, 2025 (Backfilled)
Claude 4: Opus 4 and Sonnet 4 launch
Anthropic introduced the Claude 4 family with Opus 4 and Sonnet 4, both built on a hybrid reasoning architecture.
- Claude Opus 4 and Sonnet 4 launched
- First Claude family with unified hybrid reasoning across tiers
- Initial Opus 4 at $15/$75; Sonnet 4 at $3/$15
Source
April 16, 2025 (Backfilled)
OpenAI o3 and o4-mini launch
OpenAI released o3 and o4-mini as the next generation of reasoning models following the o1/o3-mini lineage.
- o3 at $2.00/$8.00 per 1M tokens, 200K context, 100K max output
- o4-mini at $1.10/$4.40 per 1M tokens
- Both bill cached input at one-quarter of the input rate (75% off)
Source
April 14, 2025 (Backfilled)
OpenAI GPT-4.1 family launch
OpenAI launched GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano with an expanded 1M-token context window.
- GPT-4.1 at $2.00/$8.00 per 1M tokens, ~1M context
- GPT-4.1 mini at $0.40/$1.60 per 1M tokens
- First OpenAI flagship at 1M-token context
Source
March 25, 2025 (Backfilled)
Gemini 2.5 Pro launch
Google launched Gemini 2.5 Pro with a 1M-token context window and tiered pricing above 200K tokens.
- Gemini 2.5 Pro at $1.25/$10.00 per 1M tokens (standard tier)
- Long-context tier (>200K): $2.50/$15.00 per 1M tokens
- 1M-token context window
Source
February 24, 2025 (Backfilled)
Claude 3.7 Sonnet with extended thinking
Anthropic released Claude 3.7 Sonnet, the first Claude model to expose extended-thinking as a user-controllable parameter.
- Claude 3.7 Sonnet at $3/$15 per 1M tokens
- Introduced extended-thinking budget parameter
- 200K context, 64K max output
Source
February 5, 2025 (Backfilled)
Gemini 2.0 Pro Experimental
Google exposed Gemini 2.0 Pro Experimental via AI Studio ahead of the 2.5 line.
- Gemini 2.0 Pro Experimental available in AI Studio
- Free tier during experimental window
Source
January 31, 2025 (Backfilled)
OpenAI o3-mini launch
OpenAI released o3-mini as its second-generation small reasoning model, outperforming o1-mini on most reasoning benchmarks.
- o3-mini launched with reasoning-effort parameter
- Pricing parity with o1-mini at launch
Source
December 17, 2024 (Backfilled)
OpenAI o1 full release
OpenAI released the full o1 model (graduating from o1-preview) and introduced o1-pro mode in ChatGPT.
- o1 full release at $15/$60 per 1M tokens
- o1-pro mode available in ChatGPT Pro
- Reasoning-effort parameter added to the API
Source
December 11, 2024 (Backfilled)
Gemini 2.0 Flash launch
Google launched Gemini 2.0 Flash as the first of the Gemini 2.0 line, with native tool use and multimodal live streaming.
- Gemini 2.0 Flash launched with native tool use
- Multimodal live streaming API
- Free tier in AI Studio during experimental window
Source
November 4, 2024 (Backfilled)
Claude 3.5 Haiku launch
Anthropic launched Claude 3.5 Haiku at $1/$5 per 1M tokens, a 4× price increase over 3 Haiku but with substantially stronger benchmarks.
- Claude 3.5 Haiku at $1/$5 per 1M tokens
- 200K context, 8K max output at launch
- Price increase vs 3 Haiku ($0.25/$1.25) drew community pushback
Source
October 22, 2024 (Backfilled)
Claude 3.5 Sonnet v2 + computer use
Anthropic refreshed Claude 3.5 Sonnet and introduced the computer use beta API for agent workflows.
- Claude 3.5 Sonnet v2 ('claude-3-5-sonnet-20241022')
- Computer use beta API launched
- Pricing held at $3/$15 per 1M tokens
Source
October 8, 2024 (Backfilled)
Anthropic Message Batches API
Anthropic launched the Message Batches API with a 50% discount for async workloads completed within 24 hours.
- Message Batches API at 50% discount
- 24-hour turnaround window
Source
October 1, 2024 (Backfilled)
OpenAI prompt caching
OpenAI introduced automatic prompt caching with a 50% input-token discount on cached prefixes (≥1024 tokens).
- Automatic prompt caching for prefixes ≥1024 tokens
- 50% discount on cached input tokens
- Applied to GPT-4o, o1 family, and GPT-4o mini at launch
Source
September 24, 2024 (Backfilled)
Gemini 1.5 Pro 64% price cut
Google cut Gemini 1.5 Pro pricing on the standard (<128K) tier by 64% for input and 52% for output.
- Gemini 1.5 Pro standard-tier: $1.25/$5.00 per 1M tokens (down from $3.50/$10.50)
- Long-context tier >128K: $2.50/$10.00 per 1M tokens
- 64% input and 52% output reduction on the mainline tier
Source
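The percentage reductions implied by the listed before and after prices can be checked with a few lines of arithmetic. The helper name below is hypothetical; the prices are the standard-tier figures from this entry.

```python
def pct_cut(old_price, new_price):
    """Percentage reduction from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


# Gemini 1.5 Pro standard tier, USD per 1M tokens.
input_cut = pct_cut(3.50, 1.25)
output_cut = pct_cut(10.50, 5.00)
print(f"input: {input_cut:.1f}%  output: {output_cut:.1f}%")
```

With these inputs the input price falls by roughly 64% and the output price by roughly 52%.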
September 12, 2024 (Backfilled)
OpenAI o1-preview + o1-mini launch
OpenAI introduced the o1 family of reasoning models with chain-of-thought natively priced into output tokens.
- o1-preview at $15/$60 per 1M tokens
- o1-mini at $3/$12 per 1M tokens
- Reasoning tokens counted against the output meter
Source
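A short sketch of what counting reasoning tokens against the output meter means for a bill, using the o1-preview rates above. The function name and the token counts are hypothetical.

```python
def o1_request_cost(input_tokens, visible_output_tokens, reasoning_tokens,
                    input_per_m=15.00, output_per_m=60.00):
    """Estimate USD cost when hidden reasoning tokens are billed as output."""
    billed_output_tokens = visible_output_tokens + reasoning_tokens
    return (input_tokens * input_per_m
            + billed_output_tokens * output_per_m) / 1_000_000


# 1,000 input tokens, 500 visible output tokens, 4,500 reasoning tokens.
print(o1_request_cost(1_000, 500, 4_500))
```

In this example the reasoning tokens account for most of the billed output, even though the visible answer is short.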
August 14, 2024 (Backfilled)
Anthropic prompt caching (beta)
Anthropic launched prompt caching in beta with tiered cache-write / cache-read pricing.
- Cache writes at 1.25× base input rate (5-minute TTL)
- Cache reads at 0.1× base input rate
- Initially limited to 3.5 Sonnet and 3 Haiku
Source
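A minimal sketch comparing the cost of a reused prompt prefix with and without caching, using the 1.25× write and 0.1× read multipliers above. It assumes every reuse lands within the cache TTL, so the prefix is written once and read on each later call. The function names and the example figures are hypothetical.

```python
def uncached_prefix_cost(prefix_tokens, uses, input_per_m):
    """USD cost of sending the same prefix `uses` times at the base input rate."""
    return prefix_tokens * input_per_m * uses / 1_000_000


def cached_prefix_cost(prefix_tokens, uses, input_per_m,
                       write_mult=1.25, read_mult=0.1):
    """USD cost with one cache write followed by (uses - 1) cache reads."""
    if uses < 1:
        raise ValueError("uses must be at least 1")
    write_cost = prefix_tokens * input_per_m * write_mult
    read_cost = prefix_tokens * input_per_m * read_mult * (uses - 1)
    return (write_cost + read_cost) / 1_000_000


# 10,000-token prefix reused 5 times at a $3 per 1M input rate.
print(uncached_prefix_cost(10_000, 5, 3.00))
print(cached_prefix_cost(10_000, 5, 3.00))
```

Under that assumption caching already pays off on the second use: one write plus one read costs 1.35× the base rate, against 2× for two uncached sends.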
August 6, 2024 (Backfilled)
GPT-4o price cut
OpenAI cut GPT-4o input/output pricing from $5/$15 to $2.50/$10.00 per 1M tokens.
- GPT-4o input: $5.00 → $2.50 per 1M tokens (50% cut)
- GPT-4o output: $15.00 → $10.00 per 1M tokens (33% cut)
Source
July 18, 2024 (Backfilled)
GPT-4o mini launch
OpenAI launched GPT-4o mini at $0.15/$0.60 per 1M tokens, replacing GPT-3.5 Turbo as the entry-tier model.
- GPT-4o mini at $0.15/$0.60 per 1M tokens
- 128K context, 16K max output
- Positioned as the GPT-3.5 Turbo replacement
Source
June 20, 2024 (Backfilled)
Claude 3.5 Sonnet launch
Anthropic released Claude 3.5 Sonnet at Sonnet-tier pricing but with benchmarks above 3 Opus, the first 'mid-tier beats top-tier' Claude release.
- Claude 3.5 Sonnet at $3/$15 per 1M tokens
- 200K context, 4K max output at launch
- Benchmark parity with or above Claude 3 Opus
Source
May 14, 2024 (Backfilled)
Gemini 1.5 Flash launch
Google launched Gemini 1.5 Flash at I/O 2024 as the low-latency sibling of 1.5 Pro, with a 1M-token context.
- Gemini 1.5 Flash at $0.35/$1.05 per 1M tokens at launch
- 1M-token context window
- Tiered pricing introduced above 128K tokens
Source
May 13, 2024 (Backfilled)
GPT-4o launch
OpenAI launched GPT-4o at $5/$15 per 1M tokens, half the price of GPT-4 Turbo with native multimodality.
- GPT-4o at $5/$15 per 1M tokens
- 128K context, 4K max output at launch
- Native audio + vision + text multimodality
Source
April 15, 2024 (Backfilled)
OpenAI Batch API
OpenAI introduced the Batch API with a 50% discount for workloads that can tolerate a 24-hour turnaround.
- Batch API at 50% discount vs real-time pricing
- 24-hour turnaround window
- Separate rate-limit bucket from sync API
Source
March 4, 2024 (Backfilled)
Claude 3 family launch
Anthropic launched the Claude 3 family (Opus, Sonnet, Haiku) with 200K-token context across all tiers.
- Claude 3 Opus at $15/$75 per 1M tokens
- Claude 3 Sonnet at $3/$15 per 1M tokens
- Claude 3 Haiku at $0.25/$1.25 per 1M tokens
- 200K context across all three tiers
Source
February 15, 2024 (Backfilled)
Gemini 1.5 Pro preview
Google previewed Gemini 1.5 Pro with a 1M-token context window, a 5× jump over the 200K ceiling of competing flagships at the time.
- Gemini 1.5 Pro preview at 1M-token context
- Free preview in AI Studio at launch
- Standard/long-context dual-tier pricing introduced
Source
January 25, 2024 (Backfilled)
GPT-3.5 Turbo 50% price cut + text-embedding-3
OpenAI cut GPT-3.5 Turbo input by 50% and output by 25%, and launched text-embedding-3-small / -large.
- GPT-3.5 Turbo input: $0.001 → $0.0005 per 1K tokens
- GPT-3.5 Turbo output: $0.002 → $0.0015 per 1K tokens
- text-embedding-3-small at $0.00002 per 1K tokens (5× cheaper than ada-002)
- text-embedding-3-large at $0.00013 per 1K tokens
Source
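Older entries quote prices per 1K tokens, while newer ones use per 1M. A small conversion sketch, using Decimal to avoid floating-point noise at these magnitudes; the helper name is hypothetical.

```python
from decimal import Decimal


def per_1k_to_per_1m(price_per_1k):
    """Convert a USD-per-1K-token price (given as a string) to USD per 1M tokens."""
    return Decimal(price_per_1k) * 1000


# GPT-3.5 Turbo after the cut: $0.0005 input / $0.0015 output per 1K tokens.
print(per_1k_to_per_1m("0.0005"), per_1k_to_per_1m("0.0015"))
```

On the per-1M scale the post-cut GPT-3.5 Turbo rates are $0.50 input and $1.50 output.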
September 4, 2024 (Backfilled)
Claude 3.5 Sonnet 8K output
Anthropic raised the max output of Claude 3.5 Sonnet from 4K to 8K tokens without a pricing change.
- Claude 3.5 Sonnet max output raised 4K → 8K tokens
- No change to per-token pricing
Source
April 9, 2024 (Backfilled)
Claude 3 Haiku on Bedrock
Anthropic expanded Claude 3 Haiku availability to Amazon Bedrock at parity pricing with the direct API.
- Claude 3 Haiku GA on Amazon Bedrock
- Pricing parity with direct Anthropic API
Source
February 1, 2024 (Backfilled)
gpt-3.5-turbo-0125 + GPT-4 Turbo 0125-preview
OpenAI released updated GPT-3.5 Turbo and GPT-4 Turbo snapshots fixing the 'lazy model' regression reported in late 2023.
- gpt-3.5-turbo-0125 with improved instruction following
- gpt-4-0125-preview addresses December 'lazy model' reports
Source
December 6, 2023 (Backfilled)
Gemini 1.0 API launch
Google launched the Gemini 1.0 API (Pro tier) as the first Gemini-branded model family, replacing PaLM 2.
- Gemini 1.0 Pro API launched
- First Gemini-branded model family
- Replaced PaLM 2 on Google's generative stack
Source
November 21, 2023 (Backfilled)
Claude 2.1 with 200K context
Anthropic released Claude 2.1 with a 200K-token context window, double the 100K ceiling of Claude 2.0.
- Claude 2.1 at 200K-token context
- Tool use beta introduced
- Pricing cut to $8/$24 per 1M tokens (from $11.02/$32.68 for Claude 2)
Source
November 6, 2023 (Backfilled)
GPT-4 Turbo launch (DevDay 2023)
OpenAI launched GPT-4 Turbo at $10/$30 per 1M tokens (3× cheaper input and 2× cheaper output than GPT-4) with a 128K context.
- GPT-4 Turbo at $10/$30 per 1M tokens (vs $30/$60 for GPT-4 8K)
- 128K context window
- JSON mode + seed parameter introduced
Source
July 11, 2023 (Backfilled)
Claude 2 API launch
Anthropic launched the Claude 2 API with 100K-token context and a public developer waitlist.
- Claude 2 at $11.02/$32.68 per 1M tokens
- 100K-token context at launch
- First publicly accessible Claude API
Source
June 13, 2023 (Backfilled)
GPT-3.5 Turbo 16K + function calling
OpenAI released gpt-3.5-turbo-16k and introduced function calling across the GPT-4 and 3.5 families.
- gpt-3.5-turbo-16k at $0.003/$0.004 per 1K tokens (2× base rate)
- Function calling added to GPT-4 and GPT-3.5 Turbo
- gpt-3.5-turbo base input cut by 25% to $0.0015/1K
Source
March 14, 2023 (Backfilled)
GPT-4 launch
OpenAI launched GPT-4 at $30/$60 per 1M tokens (8K context) and $60/$120 per 1M tokens (32K context).
- GPT-4 8K at $30/$60 per 1M tokens
- GPT-4 32K at $60/$120 per 1M tokens
- Initial waitlist-gated developer access
Source
March 1, 2023 (Backfilled)
GPT-3.5 Turbo API launch
OpenAI launched the GPT-3.5 Turbo API at $0.002 per 1K tokens, a 10× price cut vs text-davinci-003.
- GPT-3.5 Turbo at $0.002 per 1K tokens (both in + out)
- 10× price cut versus text-davinci-003
- First ChatGPT-style chat-completions API
Source
