AI Model Parameter Sizes

Sizes of notable LLMs, embedding models, and vision models: parameter counts, context windows, embedding dimensions, and brief notes.

Figures are approximate and drawn from public sources. Parameter counts for closed-source models are undisclosed, so any number shown for them is an outside estimate.

Large language models

Model | Parameters | Context (tokens) | Notes
GPT-2 | 1.5 B | 1 024 | Early OpenAI transformer (2019)
GPT-3 | 175 B | 2 048 | 2020; few-shot breakthrough
GPT-3.5-turbo | undisclosed (~20 B, unconfirmed estimate) | 4 K / 16 K | ChatGPT launch model
GPT-4 | undisclosed (~1 T MoE estimate) | 8 K / 32 K |
GPT-4-turbo | undisclosed | 128 K | Nov 2023
GPT-4o | undisclosed | 128 K | Omni, May 2024
Claude 1 | undisclosed | 9 K / 100 K | Anthropic
Claude 2 / 2.1 | undisclosed | 100 K / 200 K |
Claude 3 (Opus/Sonnet/Haiku) | undisclosed | 200 K | March 2024
Claude 4 | undisclosed | 200 K / 1 M | 2025
Llama 2 7B / 13B / 70B | 7 / 13 / 70 B | 4 K | Meta open weights
Llama 3 8B / 70B / 405B | 8 / 70 / 405 B | 8 K / 128 K | 405B and 128 K context arrived with Llama 3.1
Mistral 7B | 7 B | 8 K / 32 K | SWA + GQA
Mixtral 8×7B | 47 B total / 13 B active | 32 K | Mixture of experts
Gemini 1.5 Pro | undisclosed | 2 M | Google long-context
Gemini 2.0 / 2.5 | undisclosed | 1 M |
Qwen 2.5 72B | 72 B | 128 K | Alibaba open weights
DeepSeek V3 | 671 B total / 37 B active | 128 K | MoE, 2024
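
The mixture-of-experts rows above list two counts, total and active. The total determines how much memory the weights need, while the active count is what runs for each token. The sketch below works that arithmetic through for the two MoE rows. The class name `MoEModel`, its methods, and the 2 bytes per parameter (16-bit weights) are assumptions made for illustration. The result is a rough lower bound that ignores activations, KV cache, and runtime overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEModel:
    """Hypothetical helper type; figures come from the table above."""
    name: str
    total_b: float   # total parameters, in billions
    active_b: float  # parameters used per token, in billions

    def active_fraction(self) -> float:
        """Share of the weights that take part in each forward pass."""
        return self.active_b / self.total_b

    def weight_gb(self, bytes_per_param: float = 2.0) -> float:
        """Rough weight storage in GB (assumes 16-bit weights by default)."""
        # billions of parameters x bytes per parameter = gigabytes (10^9 bytes)
        return self.total_b * bytes_per_param


MODELS = [
    MoEModel("Mixtral 8x7B", total_b=47, active_b=13),
    MoEModel("DeepSeek V3", total_b=671, active_b=37),
]

for m in MODELS:
    print(f"{m.name}: {m.active_fraction():.1%} of weights active per token, "
          f"~{m.weight_gb():.0f} GB of weights at 16-bit")
```

Passing `bytes_per_param=1.0` or `0.5` gives the corresponding estimate for 8-bit or 4-bit quantized weights.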

Embedding models

Model | Dimensions | Notes
OpenAI text-embedding-3-small | 1 536 (or truncated) | Replaces ada-002
OpenAI text-embedding-3-large | 3 072 |
Cohere embed-v3 | 1 024 |
Voyage v3 | 1 024 |
BGE-large-en | 1 024 | Open, BAAI
E5-mistral-7b-instruct | 4 096 | MTEB top-tier
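
The "(or truncated)" entry above refers to shortening an embedding to fewer dimensions. A minimal sketch of doing that by hand follows: keep the leading components, then rescale to unit length so cosine and dot-product comparisons still behave. The function name `truncate_embedding` and the toy vector are illustrative assumptions. Whether truncation preserves retrieval quality depends on the model having been trained to support it, so treat this as something to check against the provider's documentation rather than a general-purpose trick.

```python
import math


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and rescale the result to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero vector: nothing to rescale
    return [x / norm for x in head]


# Toy 4-dimensional vector standing in for a real 1 536-dimensional embedding.
full = [0.6, 0.0, 0.8, 0.0]
short = truncate_embedding(full, 2)
print(len(short), short)
```

Shorter vectors reduce index size and speed up similarity search, at some cost in accuracy that is worth measuring on your own data.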

Image / vision

Model | Parameters | Notes
Stable Diffusion 1.5 | 860 M | U-Net
Stable Diffusion XL | 3.5 B |
Stable Diffusion 3 | 2 B – 8 B | MM-DiT
FLUX.1 (dev) | 12 B | Black Forest Labs
DALL·E 3 | undisclosed |
CLIP ViT-L/14 | 428 M | OpenAI

Notes

  • Parameter count alone does not determine quality: training data, training method, and alignment matter as well. Compare models on benchmarks and your own evaluations rather than on raw size.