Caching: An Interactive Tour from CPU Cache to CDN

This note was generated by AI analysis.

Created: 2026-03-28 Source: https://planetscale.com/blog/caching

Summary

PlanetScale's interactive guide to caching covers every layer, from CPU L1/L2/L3 caches and RAM, through application-level caches such as Redis and Memcached, to CDNs. It explains hit rate, temporal and spatial locality, replacement policies (FIFO, LRU, Time-Aware LRU), and how Postgres and MySQL implement their internal buffer caches, with animated visualizations throughout.

Key Points

Core principle: pair large slow storage with small fast storage; keep frequently accessed data in the fast layer
Hit rate: cache_hits / total_requests × 100. Higher is better, but a larger cache costs more, so cache size versus cost is the fundamental tradeoff
Temporal locality: recently accessed data is more likely to be accessed again, e.g., X.com caches only posts from roughly the last 48 hours
Spatial locality: data near a just-accessed item is likely to be needed next, so adjacent items are prefetched, e.g., a photo album loading the next and previous photos automatically
CDNs for geographic distance: a single source of truth plus regional cache nodes; a request hits the nearest cache node and falls back to the origin on a miss
Replacement policies: FIFO (simple queue, ignores usage) → LRU (evicts the least recently used item, industry standard) → Time-Aware LRU (LRU plus a TTL per element)
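Postgres internals: shared_buffers is Postgres's own cache (the built-in default is 128MB; about 25% of RAM is a commonly recommended starting point on a dedicated server). It sits on top of the OS page cache, so a page can be held in both (double buffering), and Postgres manages it while preserving ACID guarantees
MySQL: the InnoDB buffer pool plays the same role as Postgres shared_buffers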
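
To make the hit-rate formula and the FIFO versus LRU distinction above concrete, here is a minimal simulation sketch. The function names and the request trace are illustrative assumptions, not taken from the PlanetScale article; the trace is chosen so that one key ("a") is reused often, which is the kind of temporal locality that LRU rewards.

```python
from collections import OrderedDict, deque


def hit_rate(hits: int, total: int) -> float:
    """Hit rate as a percentage: cache_hits / total_requests * 100."""
    return 100.0 * hits / total if total else 0.0


def simulate_fifo(requests, capacity):
    """FIFO: evict the key that was inserted first, ignoring later accesses."""
    cache, order, hits = set(), deque(), 0
    for key in requests:
        if key in cache:
            hits += 1              # a hit does not change the eviction order
            continue
        if len(cache) >= capacity:
            cache.discard(order.popleft())
        cache.add(key)
        order.append(key)
    return hit_rate(hits, len(requests))


def simulate_lru(requests, capacity):
    """LRU: evict the key whose most recent access is the oldest."""
    cache, hits = OrderedDict(), 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)     # mark as most recently used
            continue
        if len(cache) >= capacity:
            cache.popitem(last=False)  # drop the least recently used key
        cache[key] = True
    return hit_rate(hits, len(requests))


# Hypothetical trace: "a" is requested repeatedly between one-off keys.
trace = list("abacadae")
print(simulate_fifo(trace, capacity=2))  # 25.0
print(simulate_lru(trace, capacity=2))   # 37.5
```

On this trace FIFO evicts "a" even though it keeps being requested, while LRU keeps it resident. The gap depends entirely on the workload; on a trace with no reuse, both policies would score the same.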

Insights

The article deliberately stops at read caching and excludes write invalidation, consistency, and sharded caches. This is the right scope for an introduction, but the missing pieces are where cache bugs live in production. Cache invalidation is famously hard, as in the saying attributed to Phil Karlton: “There are only two hard things in Computer Science: cache invalidation and naming things.”

The temporal locality example with X.com is apt: in a social feed, most reads target recent posts, so cache misses tend to come from users scrolling back through older history. Tiered storage (a hot cache for recent data, slower cold storage for the archive) is the standard architectural response.
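
The tiered arrangement described above can be sketched as a read-through lookup: check the hot tier first, fall back to the slow store on a miss, and admit only recent items into the hot tier. Everything here is a hypothetical illustration: `TieredPostStore`, the in-memory `archive` dict standing in for slow storage, and the 48-hour `HOT_WINDOW` (borrowed from the figure in the Key Points) are assumptions, not how X.com or the article implement it.

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(hours=48)  # assumed cutoff for what counts as "recent"


class TieredPostStore:
    """Read-through cache: hot tier for recent posts, slow archive for the rest."""

    def __init__(self, archive):
        # Stand-in for the slow source of truth: post_id -> (created_at, body)
        self._archive = archive
        self._hot = {}
        self.hits = 0
        self.misses = 0

    def get(self, post_id, now=None):
        now = now or datetime.now(timezone.utc)
        entry = self._hot.get(post_id)
        if entry is not None and now - entry[0] <= HOT_WINDOW:
            self.hits += 1
            return entry[1]
        # Either never cached, or it has aged out of the hot window.
        self._hot.pop(post_id, None)
        self.misses += 1
        created_at, body = self._archive[post_id]  # slow path
        if now - created_at <= HOT_WINDOW:
            self._hot[post_id] = (created_at, body)  # admit only recent posts
        return body


now = datetime(2026, 1, 3, tzinfo=timezone.utc)
store = TieredPostStore({
    "new": (now - timedelta(hours=1), "fresh post"),
    "old": (now - timedelta(days=30), "archived post"),
})
store.get("new", now)  # miss, then cached in the hot tier
store.get("new", now)  # hit
store.get("old", now)  # miss, too old to be admitted
store.get("old", now)  # miss again
print(store.hits, store.misses)  # 1 3
```

The design choice is that old posts always pay the slow-path cost, which keeps the hot tier small and cheap at the expense of users who scroll far back.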

LRU is “industry standard” for a reason: with a hash map plus a doubly-linked list, both lookup and eviction are O(1), and it tracks recency-biased access patterns better than simpler policies such as FIFO.
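
As an illustration of the hash map plus doubly-linked list structure mentioned above, here is a minimal sketch. The class and method names are assumptions chosen for this note; in practice Python's standard library already offers `collections.OrderedDict` and `functools.lru_cache`, which would usually be preferred over a hand-rolled version.

```python
class _Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    """LRU cache with O(1) get and put: dict for lookup, linked list for order."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._index = {}       # key -> node
        self._head = _Node()   # sentinel on the most recently used side
        self._tail = _Node()   # sentinel on the least recently used side
        self._head.next, self._tail.prev = self._tail, self._head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self._head, self._head.next
        self._head.next.prev = node
        self._head.next = node

    def get(self, key, default=None):
        node = self._index.get(key)
        if node is None:
            return default
        self._unlink(node)       # a read counts as a use,
        self._push_front(node)   # so move the node to the front
        return node.value

    def put(self, key, value):
        node = self._index.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self._index) >= self.capacity:
            lru = self._tail.prev    # least recently used node
            self._unlink(lru)
            del self._index[lru.key]
        node = _Node(key, value)
        self._index[key] = node
        self._push_front(node)


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # evicts "b", the least recently used
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

The sentinel head and tail nodes remove the empty-list and single-node special cases, so every unlink and insert is the same few pointer updates.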

Connections

Raw Excerpt

Every time you use a computer, caches work to ensure your experience is fast. Everything a computer does from executing an instruction on the CPU, to requesting your X.com feed, to loading this very webpage, relies heavily on caching.
