Computer Architecture Diagram
Modern CPU architecture overview — cores, caches, pipelines, memory controller, and interconnect.
Reference
Major blocks
| Block | Role |
|---|---|
| Core | Executes instructions — fetch, decode, execute, retire |
| L1 instruction / data cache | Small, per-core, roughly 3–5 cycle load latency |
| L2 cache | Larger; private per core or shared by a small cluster of cores, roughly 10–15 cycles |
| L3 / last-level cache (LLC) | Shared across cores on socket, tens of cycles |
| Memory controller | Interfaces to DDR DRAM (typically 2 channels on desktops, 8 or more on servers) |
| PCIe root complex | Connects to GPU, NVMe, network |
| Chipset / IO hub | USB, SATA, slower IO |
| Coherence fabric | Ring, mesh, or point-to-point between cores |
| Power / clock management | DVFS, C-states, P-states |
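The cache latencies in the table can be combined into an average memory access time (AMAT). The following is a minimal sketch of the textbook formula, which treats each level's latency as an additional cost paid by every access that reaches it. The function name `amat`, the hit rates, and the 200-cycle DRAM latency are assumptions chosen for illustration, not measurements.

```python
def amat(levels, memory_latency):
    """Average memory access time in cycles.

    levels: list of (hit_latency_cycles, hit_rate) pairs, ordered from L1 outward.
    memory_latency: cycles for an access that misses every cache level.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that reach the current level
    for latency, hit_rate in levels:
        total += reach * latency    # every access reaching this level pays its latency
        reach *= (1.0 - hit_rate)   # only the misses continue outward
    return total + reach * memory_latency


# Assumed numbers for illustration only: (latency in cycles, hit rate).
levels = [(4, 0.95), (12, 0.80), (40, 0.50)]
print(amat(levels, memory_latency=200))
```

With these assumed inputs the result is about 6 cycles, even though DRAM is 50 times slower than L1, because only a small fraction of accesses reaches it.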
Pipeline stages (simplified)
- Fetch: get instructions from I-cache; branch prediction.
- Decode: convert to micro-ops (µops).
- Rename: map architectural → physical registers.
- Dispatch: issue to reservation stations.
- Execute: integer / FP / load-store units.
- Writeback: store result in physical register.
- Retire: commit to architectural state (in program order).
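The rename stage is the least intuitive of these, so here is a toy sketch of it. The class `RenameTable` and the three-instruction program are hypothetical. The sketch also never returns physical registers to the free list, which a real core does when instructions retire.

```python
class RenameTable:
    """Toy register renamer: architectural names map to physical register ids."""

    def __init__(self, arch_regs, num_physical):
        # Initially each architectural register owns one physical register.
        self.mapping = {name: i for i, name in enumerate(arch_regs)}
        self.free = list(range(len(arch_regs), num_physical))

    def rename(self, dest, sources):
        """Rename one instruction; returns (physical_dest, physical_sources)."""
        # Sources read the current mapping, which preserves true (RAW) dependencies.
        phys_sources = [self.mapping[s] for s in sources]
        if not self.free:
            raise RuntimeError("no free physical register; the front end would stall")
        # The destination gets a fresh register, which removes WAR and WAW hazards.
        phys_dest = self.free.pop(0)
        self.mapping[dest] = phys_dest
        return phys_dest, phys_sources


table = RenameTable(["r1", "r2", "r3"], num_physical=8)
program = [
    ("r1", ["r2", "r3"]),  # r1 = r2 + r3
    ("r2", ["r1"]),        # r2 = r1 * 2  (reads the renamed r1)
    ("r1", ["r3"]),        # r1 = r3 - 1  (second write to r1 gets its own register)
]
for dest, srcs in program:
    print(dest, srcs, "->", table.rename(dest, srcs))
```

Because the two writes to `r1` land in different physical registers, the third instruction does not have to wait for the first two, which is what lets the out-of-order back end reorder them.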
Modern CPU features
- Superscalar: multiple instructions issue per cycle.
- Out-of-order: executes ready ops first; retires in order.
- SMT / hyperthreading: two (or more) hardware threads share one core's pipeline, execution units, and caches.
- SIMD (AVX, NEON): operate on many data elements per instruction.
- Branch prediction: modern predictors reach >95% accuracy.
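- Speculative execution: execute past predicted branches before they resolve; squash the speculative work on a misprediction. Microarchitectural side effects (such as cache state) that survive the squash are what Spectre- and Meltdown-class attacks exploit, which is why mitigations exist.
- NUMA: each socket has its own local memory, and remote accesses are slower; pin threads and their memory to the same node.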
- Chiplets / tiles: AMD / Intel split cores and IO across dies.
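To make the branch prediction entry concrete, here is a sketch of the classic textbook scheme: a table of 2-bit saturating counters indexed by the low bits of the branch address. It is an illustration only. Predictors in shipping CPUs also use branch history and are far more elaborate. The names `TwoBitPredictor` and `accuracy`, the table size, and the trace are all assumptions.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low bits of the branch address."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Counter values 0-1 predict not taken, 2-3 predict taken.
        self.counters = [1] * (1 << index_bits)

    def predict(self, pc):
        return self.counters[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, trace):
    """Fraction of (pc, taken) outcomes predicted correctly, training as it goes."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


# Hypothetical trace: a loop branch at 0x40, taken 9 times then not taken, repeated 10 times.
trace = ([(0x40, True)] * 9 + [(0x40, False)]) * 10
print(f"accuracy: {accuracy(TwoBitPredictor(), trace):.2f}")
```

On this trace the predictor misses once while warming up and then once per loop exit, for 89 correct predictions out of 100. The two-bit hysteresis is what keeps a single loop exit from flipping the prediction for the next loop entry.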