AI Token Limits & Costs
Context windows and per-token prices for major LLM APIs — ballpark figures for cost estimation.
Reference
Prices are USD per 1M tokens. Check provider docs for the latest figures; these change often.
OpenAI
| Model | Context | Input / 1M | Output / 1M |
|---|---|---|---|
| GPT-4o | 128K | $2.50 | $10.00 |
| GPT-4o mini | 128K | $0.15 | $0.60 |
| GPT-4.1 | 1M | $2.00 | $8.00 |
| GPT-4.1 mini | 1M | $0.40 | $1.60 |
| GPT-4.1 nano | 1M | $0.10 | $0.40 |
| o1 | 200K | $15.00 | $60.00 |
| o3-mini | 200K | $1.10 | $4.40 |
Anthropic
| Model | Context | Input / 1M | Output / 1M |
|---|---|---|---|
| Claude 3.5 Haiku | 200K | $0.80 | $4.00 |
| Claude 3.5 Sonnet | 200K | $3.00 | $15.00 |
| Claude 3 Opus | 200K | $15.00 | $75.00 |
| Claude 4 Sonnet | 200K | $3.00 | $15.00 |
| Claude 4 Opus | 200K | $15.00 | $75.00 |
Google
| Model | Context | Input / 1M | Output / 1M |
|---|---|---|---|
| Gemini 2.0 Flash | 1M | $0.10 | $0.40 |
| Gemini 2.5 Flash | 1M | $0.15 | $0.60 |
| Gemini 2.5 Pro | 1M | $1.25 (≤200K) / $2.50 (>200K) | $10.00 (≤200K) / $15.00 (>200K) |
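The tables above plug directly into a back-of-the-envelope cost estimate. A minimal sketch, where `PRICES` and `estimate_cost` are hypothetical names (not any provider's SDK) and the per-1M values are copied from the tables in this document:

```python
# Per-1M-token (input, output) prices in USD, copied from the tables above.
# These go stale quickly; check provider pricing pages before relying on them.
PRICES = {
    "gpt-4o":            (2.50, 10.00),
    "gpt-4o-mini":       (0.15, 0.60),
    "claude-3-5-sonnet": (3.00, 15.00),
    "gemini-2.0-flash":  (0.10, 0.40),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: 10K input tokens + 1K output tokens on GPT-4o
print(round(estimate_cost("gpt-4o", 10_000, 1_000), 4))  # → 0.035
```

Note that output tokens usually cost 4x the input price, so long completions dominate the bill even when prompts are large.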
Notes
- Cached input can drop prices by 50–90% — investigate prompt caching when reusing system prompts.
- Batch APIs offer ~50% off in exchange for up to 24-hour turnaround on OpenAI and Anthropic.
- 1 token ≈ 4 English characters or ~0.75 words on average. Code and non-English text typically use more tokens per character.
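The character/word rule of thumb above can be sketched as a quick pre-flight token estimate. This is a heuristic only, and `estimate_tokens` is a hypothetical helper; a real tokenizer (e.g. OpenAI's tiktoken) gives exact counts:

```python
# Rough token estimate for English text using the heuristics above:
# ~4 characters per token and ~0.75 words per token. Averaging the two
# smooths out texts with unusually long or short words.
def estimate_tokens(text: str) -> int:
    by_chars = len(text) / 4
    by_words = len(text.split()) / 0.75
    return round((by_chars + by_words) / 2)

sample = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens(sample))
```

Expect this to undercount for code and most non-English languages, so pad estimates upward before budgeting.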
Last updated: