AI Token Limits & Costs

Context windows and per-token prices for major LLM APIs — ballpark figures for cost estimation.

Updated Apr 19, 2026

Prices are in USD per 1M tokens. Check each provider's docs for the latest figures; these change often.
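Because every price is quoted per 1M tokens, a request's cost is the token count times the price, divided by one million. A minimal sketch of that arithmetic follows; the function name `estimate_cost` and the example token counts are illustrative assumptions, and the example rates are the GPT-4o list prices from the OpenAI table.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the estimated cost of one request in USD.

    Prices are USD per 1M tokens, as in the tables on this page.
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical workload: 10,000 input and 2,000 output tokens
# at GPT-4o list prices ($2.50 input / $10.00 output per 1M).
cost = estimate_cost(10_000, 2_000, 2.50, 10.00)
print(f"${cost:.4f}")  # prints $0.0450
```

Multiply the result by the expected number of requests per day or month to get a rough budget figure.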

OpenAI

Model          Context   Input / 1M   Output / 1M
GPT-4o         128K      $2.50        $10.00
GPT-4o mini    128K      $0.15        $0.60
GPT-4.1        1M        $2.00        $8.00
GPT-4.1 mini   1M        $0.40        $1.60
GPT-4.1 nano   1M        $0.10        $0.40
o1             200K      $15.00       $60.00
o3-mini        200K      $1.10        $4.40

Anthropic

Model               Context   Input / 1M   Output / 1M
Claude 3.5 Haiku    200K      $0.80        $4.00
Claude 3.5 Sonnet   200K      $3.00        $15.00
Claude 3 Opus       200K      $15.00       $75.00
Claude 4 Sonnet     200K      $3.00        $15.00
Claude 4 Opus       200K      $15.00       $75.00

Google

Model              Context   Input / 1M                      Output / 1M
Gemini 2.0 Flash   1M        $0.10                           $0.40
Gemini 2.5 Flash   1M        $0.15                           $0.60
Gemini 2.5 Pro     1M        $1.25 (≤200K) / $2.50 (>200K)   $10.00 (≤200K) / $15.00 (>200K)
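The Gemini 2.5 Pro row is tiered, so a flat per-token multiplication does not apply. The sketch below shows one way to handle it. It assumes the tier is selected by prompt (input) size, that the higher rates then apply to the whole request, and that "200K" means 200,000 tokens; confirm all three against Google's pricing page. The function name is hypothetical.

```python
TIER_THRESHOLD = 200_000  # assumption: "200K" means 200,000 prompt tokens


def gemini_25_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one Gemini 2.5 Pro request using the tiered rates above.

    Assumption: the prompt size alone picks the tier, and that tier's
    input and output rates apply to every token in the request.
    """
    if input_tokens <= TIER_THRESHOLD:
        input_price, output_price = 1.25, 10.00
    else:
        input_price, output_price = 2.50, 15.00
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical requests on either side of the threshold.
print(f"${gemini_25_pro_cost(100_000, 5_000):.3f}")  # prints $0.175
print(f"${gemini_25_pro_cost(300_000, 5_000):.3f}")  # prints $0.825
```

Under these assumptions, crossing the threshold raises the price of every token in the request, so trimming a prompt to just under 200K can cut the bill noticeably.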

Notes

  • Cached input tokens are typically billed 50–90% below the standard input price. Consider prompt caching when the same system prompt is reused across requests.
  • Batch APIs on OpenAI and Anthropic offer ~50% off in exchange for asynchronous processing, with results returned within 24 h.
  • 1 token ≈ 4 English characters ≈ 0.75 words on average. Code and non-English text use more tokens for the same content.
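The rules of thumb in these notes can be turned into a rough estimator. This is a sketch only: the function names, the default discount values, and the example workload are assumptions for illustration. Real token counts come from the provider's tokenizer, and real cache pricing may include extra charges (for example, for writing to the cache) that are ignored here.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English text; code and other languages run higher


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length (not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def cached_input_cost(input_tokens: int, cached_tokens: int,
                      price_per_m: float, cache_discount: float = 0.5) -> float:
    """USD cost of input tokens when `cached_tokens` of them hit the prompt cache.

    `cache_discount` is the fraction taken off the input price for cached
    tokens (0.5 to 0.9 per the note above); the default is an assumption.
    """
    uncached_tokens = input_tokens - cached_tokens
    cached_price = price_per_m * (1 - cache_discount)
    return (uncached_tokens * price_per_m + cached_tokens * cached_price) / 1_000_000


def batch_cost(cost: float, batch_discount: float = 0.5) -> float:
    """Apply a batch discount (~50% per the note above) to a computed cost."""
    return cost * (1 - batch_discount)


# Hypothetical request: an 8,000-token cached system prompt plus 2,000 fresh
# tokens, priced at $3.00 per 1M input with a 90% cache discount.
cost = cached_input_cost(10_000, 8_000, 3.00, cache_discount=0.9)
print(f"${cost:.4f}")  # prints $0.0084
print(estimate_tokens("x" * 8_000))  # prints 2000
```

Whether cache and batch discounts can be combined on a given provider is not covered by the notes above, so the two helpers are kept separate rather than stacked by default.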
