Web & Dev

Memory Hierarchy Reference

CPU registers to persistent storage — sizes, latencies, and trade-offs.

Latency ladder (typical 2020s CPU)

LevelTypical sizeAccess latencyThroughput
Registers~1 KB (few hundred)< 1 cycle~TB/s per core
L1 cache32–64 KB per core~4 cycles (~1 ns)~TB/s
L2 cache512 KB – 2 MB per core~12 cycles (~3 ns)TB/s
L3 cache~MBs (shared)~40 cycles (~10 ns)~100s GB/s
DRAMGBs~100 ns30–80 GB/s per channel
NVMe SSDTBs10–100 µs3–14 GB/s
SATA SSDTBs50–200 µs~550 MB/s
HDD1–30 TB~10 ms (seek)~200 MB/s
TapeTBssecondsMB–GB/s (sequential)
Network (LAN)100 µs – 1 ms1–100 Gbit/s
Network (internet)10–100 msvaries

Intuitions

  • 1 cycle on a 3 GHz CPU ≈ 0.33 ns.
  • L1 hit vs DRAM: ~100× difference (1 ns vs 100 ns).
  • DRAM vs NVMe: ~1000× (100 ns vs 100 µs).
  • Cache-friendly code can run 10–100× faster than cache-oblivious for memory-bound work.
  • Jeff Dean's "latency numbers every programmer should know" — timeless reference.
Was this article helpful?