Coding agent cost calculator

Agentic coding loops are the most expensive LLM workload there is — many tool calls, big context windows, long outputs. Cost yours honestly.

Coding agents — Claude Code, Cursor, Devin, a home-grown loop — blow past typical chatbot costs because a single user turn can fan out into 5-30 model calls as the agent reads files, runs tools, and iterates. Under-modelling that fan-out is the single biggest source of "why is our bill so high" surprise.

The defaults below encode a realistic mid-size coding task: 15 tool calls per user turn, 8,000 tokens of average context (file reads + previous messages), and 600 tokens of output per call. Tune the tool-call multiplier first — it's usually the dominant number.
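Under those defaults, per-request cost is plain arithmetic over the fan-out. A minimal sketch of that calculation (the per-token rates here are illustrative placeholders, not the pricing used in the table below):

```python
def cost_per_request(tool_calls=15, input_tokens=8_000, output_tokens=600,
                     input_rate=0.05, output_rate=0.40):
    """Estimate the cost of one user turn, in USD.

    Rates are $ per million tokens. The default rates are placeholders
    for illustration only, not quotes for any specific model.
    """
    input_cost = tool_calls * input_tokens * input_rate / 1_000_000
    output_cost = tool_calls * output_tokens * output_rate / 1_000_000
    return input_cost + output_cost

# 15 calls x 8,000 tokens = 120k input tokens billed per user turn,
# which is why the tool-call multiplier dominates the total.
print(round(cost_per_request(), 4))  # 0.0096 at the placeholder rates
```

Doubling the tool-call count doubles both terms, while doubling output length only moves the (much smaller) output term; that asymmetry is why the tool-call multiplier is the knob to tune first.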

Workload parameters

Costs update live across every model in the table below.

Top 8 cheapest models for this workload

Sorted by total cost per request (input + output, with tokenizer and long-context adjustments applied).

| Model | Provider | Per request | Per day | Monthly (×30) |
|---|---|---|---|---|
| GPT-5 nano | OpenAI | $0.00624 | $0.62 | $18.72 |
| Gemini 2.5 Flash-Lite | Google | $0.01224 | $1.22 | $36.72 |
| GPT-4o mini | OpenAI | $0.01836 | $1.84 | $55.08 |
| GPT-5.4 nano | OpenAI | $0.02475 | $2.48 | $74.25 |
| Gemini 3.1 Flash-Lite (preview) | Google | $0.03090 | $3.09 | $92.70 |
| GPT-5 mini | OpenAI | $0.03120 | $3.12 | $93.60 |
| Gemini 2.5 Flash | Google | $0.03750 | $3.75 | $112.50 |
| GPT-4.1 mini | OpenAI | $0.04896 | $4.90 | $146.88 |
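The per-day and monthly columns are straight multiples of the per-request figure; working backwards from the GPT-5 nano row, the table implies roughly 100 requests per day (an inference from the numbers, not a stated parameter). A sketch of that rollup:

```python
def rollup(per_request, requests_per_day=100, days=30):
    """Expand a per-request cost into (daily, monthly) totals.

    The 100 requests/day default is inferred from the table rows above.
    Monthly is computed from the unrounded daily figure, which is why
    a displayed $0.62/day yields $18.72/month rather than $18.60.
    """
    per_day = per_request * requests_per_day
    return per_day, per_day * days

per_day, monthly = rollup(0.00624)  # GPT-5 nano row
print(f"${per_day:.2f}/day, ${monthly:.2f}/month")  # $0.62/day, $18.72/month
```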

Scaling GPT-5 nano

What the cheapest option costs as your traffic grows.

  • Baseline: $18.72 / month
  • 2× volume: $37.44 / month
  • 5× volume: $93.60 / month
  • 10× volume: $187.20 / month

Optimization tips

  • Prompt caching is non-negotiable. Agent loops re-read the same files, tool schemas, and system prompts — Anthropic's 5-minute cache catches most of this at 10% of the input rate.
  • Watch the Opus 4.7 tokenizer. Its 1.15× multiplier quietly rebills most coding workloads at 15% higher cost than Opus 4.6 for the same transcript; budget accordingly.
  • Cheaper models for mechanical steps. Route file reads, grep, and bash through Haiku or GPT-5-mini; save Opus/GPT-5 for the planning and diff-writing steps.
  • Cap iteration count. An agent that hasn't solved a task in 10 turns usually won't in 50 — it'll just spend more. Hard limits save money and ship better UX.
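The caching tip is worth quantifying. Assuming cache reads bill at 10% of the normal input rate (the figure cited in the tip above; the hit rate and token counts below are illustrative):

```python
def input_cost_with_cache(tokens, rate_per_m, hit_rate,
                          cached_fraction_of_rate=0.10):
    """Effective input cost when a share of tokens hit the prompt cache.

    hit_rate: fraction of input tokens served from cache.
    cached_fraction_of_rate: cache reads billed at this fraction of the
    normal input rate (0.10 matches the 10% figure in the tip above).
    """
    full_price = tokens * rate_per_m / 1_000_000
    return full_price * ((1 - hit_rate) + hit_rate * cached_fraction_of_rate)

# 120k input tokens per turn at $3/M, with 80% of them cache hits:
base = input_cost_with_cache(120_000, 3.0, hit_rate=0.0)
cached = input_cost_with_cache(120_000, 3.0, hit_rate=0.8)
print(f"{1 - cached / base:.0%} saved")  # 72% saved
```

An 80% hit rate cuts input spend by 72% here, squarely inside the 40-85% savings range the FAQ below cites; since input tokens dominate agent bills, this usually moves the total more than any model swap.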

Frequently asked

Why are coding agents so expensive?

Every user request fans out into many LLM calls, and each call carries heavy context (file contents, tool schemas, previous messages). A single "fix this bug" request typically costs 10-50× what a chatbot reply of the same apparent user-facing size would.

Which model is cheapest for agentic coding?

Claude Haiku 4.5 and Gemini 2.5 Flash are dramatically cheaper per token, but they're also noticeably worse at multi-step planning. For production agents, Sonnet 4.6 or GPT-5-mini usually win on cost-per-successful-task; Opus 4.7 and GPT-5 win when task completion matters more than per-request cost.

How much does prompt caching save on a coding agent?

40-85%, depending on how often the same files and tool schemas appear across calls in the cache window. Anthropic's 5-minute TTL is usually long enough for a user's active session; OpenAI's caching is automatic but only 25-50% off.

Does the calculator include extended thinking / reasoning tokens?

No — it treats "output" as the final answer. If you enable extended thinking (Claude) or reasoning tokens (OpenAI o-series), add an extra 500-3,000 output tokens per call depending on the task; those bill at the output rate.

Should I self-host a model for cost reasons?

Only above roughly $3k/month in API spend, and only if you have GPU ops expertise. For most teams, aggressive prompt optimization and caching cuts more cost than switching to self-hosted open-weight models, with zero quality hit.

Need a precise number for your actual prompt?

Paste a real prompt into the estimator and get token-accurate costs — input tokens counted with the provider's own tokenizer, output tokens predicted by our regression model.