Computer Memory Architecture
DRAM, cache, virtual memory, and paging — how CPUs access data.
Reference
DRAM generations
| Type | Data rate | Peak bandwidth (single channel) |
|---|---|---|
| DDR3-1600 | 1600 MT/s | 12.8 GB/s |
| DDR4-2400 | 2400 MT/s | 19.2 GB/s |
| DDR4-3200 | 3200 MT/s | 25.6 GB/s |
| DDR5-4800 | 4800 MT/s | 38.4 GB/s |
| DDR5-6400 | 6400 MT/s | 51.2 GB/s |
| LPDDR5-6400 | 6400 MT/s | ~12.8 GB/s per 16-bit channel (mobile, soldered) |
| HBM2e | 3.6 Gbps/pin | ~460 GB/s per stack |
| HBM3 | 6.4 Gbps/pin | ~800 GB/s per stack |
Virtual memory
- Page size
- 4 KB default; huge pages 2 MB or 1 GB
- Page table
- Maps virtual → physical addresses — a multi-level radix tree (4 levels on x86-64, 5 with LA57)
- TLB
- Caches recent translations
- TLB miss
- A miss triggers a hardware page-table walk, costing on the order of 100 ns on x86; huge pages cut miss rates for large working sets
- Swap / paging
- Move cold pages to disk — modern systems avoid swap when possible
Cache behavior
- Cache line
- 64 bytes on x86 / ARM
- Associativity
- 8–16-way typical
- Coherence
- MESI / MOESI between cores
- False sharing
- Different cores writing distinct variables that share a cache line — pad or align per-core data to 64 B
- Write-back
- Dirty lines flushed to next level on eviction
Tips
- Design data for locality — contiguous arrays beat linked lists for iteration.
- Align hot data on cache-line boundaries.
- Prefetch: modern CPUs detect sequential access automatically; manual prefetch hints help for irregular patterns.
- Avoid false sharing between threads by padding or separating per-thread data.