# AI Model Parameter Sizes
Notable LLM and vision model sizes: parameter counts, context windows, and notes.

Figures are approximate and drawn from public sources. Private lab models vary, and parameter counts for closed-source models are third-party estimates.

## Large language models
| Model | Parameters | Context | Notes |
|---|---|---|---|
| GPT-2 | 1.5 B | 1 024 | Early OpenAI transformer (2019) |
| GPT-3 | 175 B | 2 048 | Few-shot breakthrough (2020) |
| GPT-3.5-turbo | undisclosed (~20 B estimate) | 4 K / 16 K | Powered early ChatGPT |
| GPT-4 | undisclosed (~1 T MoE estimate) | 8 K / 32 K | March 2023 |
| GPT-4-turbo | undisclosed | 128 K | Nov 2023 |
| GPT-4o | undisclosed | 128 K | Omni, May 2024 |
| Claude 1 | undisclosed | 100 K | Anthropic |
| Claude 2 / 2.1 | undisclosed | 100 K / 200 K | 2.1 extends context to 200 K |
| Claude 3 (Opus/Sonnet/Haiku) | undisclosed | 200 K | March 2024 |
| Claude 4 | undisclosed | 200 K / 1 M | 2025 |
| Llama 2 7B / 13B / 70B | 7 / 13 / 70 B | 4 K | Meta open weights |
| Llama 3 8B / 70B / 405B | 8 / 70 / 405 B | 8 K / 128 K | 3.1 adds 405 B and 128 K context |
| Mistral 7B | 7 B | 8 K / 32 K | Sliding-window + grouped-query attention |
| Mixtral 8×7B | 47 B total / 13 B active | 32 K | Mixture of experts |
| Gemini 1.5 Pro | undisclosed | 2 M | Google long-context |
| Gemini 2.0 / 2.5 | undisclosed | 1 M | |
| Qwen 2.5 72B | 72 B | 128 K | Alibaba open weights |
| DeepSeek V3 | 671 B total / 37 B active | 128 K | MoE, 2024 |
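
A quick way to use the parameter column: weight memory is roughly parameters × bytes per parameter, and for the MoE rows (Mixtral, DeepSeek V3) memory tracks total parameters while per-token compute tracks active parameters. Below is a minimal back-of-envelope sketch; the helper name `weight_gib` is illustrative, and real deployments add KV cache, activations, and runtime overhead on top.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_gib(params_billion: float, dtype: str = "fp16") -> float:
    """Approximate weight memory in GiB; ignores KV cache and activations."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30

# Dense model: Llama 3 70B needs ~130 GiB of weights at fp16.
print(f"Llama 3 70B @ fp16: {weight_gib(70):.0f} GiB")

# MoE: every expert must be resident, so memory follows the 47 B total,
# even though each token only activates ~13 B parameters.
print(f"Mixtral 8x7B @ fp16: {weight_gib(47):.0f} GiB resident")
```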

## Embedding models
| Model | Dimensions | Notes |
|---|---|---|
| OpenAI text-embedding-3-small | 1 536 (truncatable) | Replaces text-embedding-ada-002 |
| OpenAI text-embedding-3-large | 3 072 | |
| Cohere embed-v3 | 1 024 | |
| Voyage v3 | 1 024 | |
| BGE-large-en | 1 024 | Open, BAAI |
| E5-mistral-7b-instruct | 4 096 | MTEB top-tier |
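
The "truncatable" note on text-embedding-3-small refers to Matryoshka-style shortening: keep a leading slice of the vector and renormalize. OpenAI's embeddings API exposes this via a `dimensions` request parameter; the pure-NumPy sketch below shows the same operation offline. The function name is illustrative, and the random vector stands in for a real embedding.

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` components, then L2-normalize.

    Works because Matryoshka-style training front-loads information
    into the leading dimensions of the vector.
    """
    v = vec[:dims]
    return v / np.linalg.norm(v)

# e.g. shrink a 1536-dim text-embedding-3-small vector to 256 dims
full = np.random.randn(1536)           # stand-in for a real embedding
small = truncate_embedding(full, 256)
assert small.shape == (256,) and np.isclose(np.linalg.norm(small), 1.0)
```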

## Image / vision
| Model | Size | Notes |
|---|---|---|
| Stable Diffusion 1.5 | 860 M | U-Net |
| Stable Diffusion XL | 3.5 B | |
| Stable Diffusion 3 | 2 B – 8 B | MM-DiT |
| FLUX.1 (dev) | 12 B | Black Forest Labs |
| DALL·E 3 | undisclosed | |
| CLIP ViT-L/14 | 428 M | OpenAI |
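
Parameter counts for open-weight rows are easy to verify yourself. A minimal sketch, assuming `torch` and `transformers` are installed (it downloads the CLIP ViT-L/14 checkpoint from the Hugging Face Hub on first run):

```python
import torch
from transformers import AutoModel

def param_count(model: torch.nn.Module) -> int:
    """Total parameter count of a PyTorch module."""
    return sum(p.numel() for p in model.parameters())

# Check the CLIP ViT-L/14 row: ~428 M parameters across both towers.
clip = AutoModel.from_pretrained("openai/clip-vit-large-patch14")
print(f"{param_count(clip) / 1e6:.0f} M parameters")
```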

## Notes
- Parameter count alone doesn't determine quality: data, training compute, and alignment matter as much or more. Compare models with benchmarks and task-specific evaluation rather than raw size.