Distributed Cache Architecture
Production caching pairs an in-process L1 cache (Caffeine) with a shared L2 cache (Redis cluster), backed by explicit invalidation, TTL jitter, and stampede protection. It is a pattern suited to Amazon-scale product catalogs.
Introduction
A cache added without an architecture delivers stale prices, thundering herds, and debugging nightmares.
Cache-aside remains the default; use write-through where reads need stronger consistency with writes. For volatile data, event-driven invalidation (Kafka events triggering @CacheEvict) scales better than relying on TTL alone.
Document what is cached, TTL, invalidation owner, and acceptable staleness per domain.
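One way to keep that documentation close to the code is a small policy record per cache. The sketch below is only an illustration: the `CachePolicy` record, its field names, and the sample values are assumptions, not part of the original design.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical record: one entry per cache, mirroring the items the text asks teams to document.
record CachePolicy(String cacheName, String cachedData, Duration ttl,
                   String invalidation, String owner, Duration maxStaleness) {

    CachePolicy {
        // With TTL-only invalidation, data can be as stale as the TTL itself,
        // so a TTL longer than the tolerated staleness is rejected.
        if (invalidation.equals("ttl-only") && ttl.compareTo(maxStaleness) > 0) {
            throw new IllegalArgumentException(
                cacheName + ": TTL exceeds acceptable staleness without explicit invalidation");
        }
    }
}

class CachePolicies {
    // Sample values are illustrative only, not recommendations from the source.
    static final List<CachePolicy> ALL = List.of(
        new CachePolicy("catalog:product", "product details by SKU",
            Duration.ofMinutes(30), "event", "catalog-team", Duration.ofSeconds(5)),
        new CachePolicy("geo:lookup", "IP-to-region mapping",
            Duration.ofHours(1), "ttl-only", "platform-team", Duration.ofHours(24)));
}
```

Failing fast in the constructor turns a mismatch between TTL and staleness budget into a startup error instead of a production incident.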
Understanding the topic
Key concepts
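- Cache-aside: the application checks the cache, loads from the database on a miss, and populates the cache itself.
- Write-through: each write updates the cache and the database together.
- Two-tier: a local in-process L1 in front of a shared Redis L2.
- Invalidation: by TTL, by event, or by version stamp in the key.
- Stampede protection: synchronized (single-flight) loading, TTL jitter, and early refresh.
- Negative cache: cache "not found" results so lookups for unknown keys do not reach the database every time.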
```mermaid
flowchart LR
App -->|GET key| Redis
Redis -->|miss| DB
DB -->|populate| Redis
Redis --> App
```
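Two of the stampede mitigations named in the key concepts, synchronized loading and TTL jitter, can be sketched with the JDK alone. This is a minimal illustration under stated assumptions: `SingleFlightLoader` and `jitteredTtl` are hypothetical names, and the loader function stands in for whatever reads Redis or the database.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;

// Hypothetical helper: coalesces concurrent loads of the same key into one backend call.
class SingleFlightLoader<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    V load(K key, Function<K, V> loader) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            // Another caller is already loading this key: wait for its result.
            return existing.join();
        }
        try {
            V value = loader.apply(key);
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            // Waiters receive a CompletionException wrapping this failure.
            mine.completeExceptionally(e);
            throw e;
        } finally {
            // Remove the marker so a later miss triggers a fresh load.
            inFlight.remove(key, mine);
        }
    }

    // Returns the base TTL scaled by a random factor in [1 - fraction, 1 + fraction).
    // `fraction` must be greater than 0 and less than 1.
    static Duration jitteredTtl(Duration base, double fraction) {
        double factor = 1.0 + ThreadLocalRandom.current().nextDouble(-fraction, fraction);
        return Duration.ofMillis(Math.round(base.toMillis() * factor));
    }
}
```

The loader only deduplicates loads that are in flight at the same moment; the caller is still expected to store the returned value in the cache, for example with `jitteredTtl(Duration.ofMinutes(30), 0.2)` as the expiry so that hot keys written together do not all expire together.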
Step-by-step explanation
- Read: check L1, then L2, then the DB on a miss.
- Populate L2, then optionally L1.
- Write: update the DB, then evict the entry and publish an invalidation message.
- Other nodes drop their L1 entry when the message arrives.
- Metrics track the hit rate per cache name.
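The steps above can be traced end to end in a small in-memory simulation. Everything here is a stand-in chosen for illustration: `SharedInfra` plays the roles of Redis, the database, and the invalidation channel, `CacheNode` is one application instance, and TTLs are left out to keep the flow visible.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// In-memory stand-ins (assumptions): `l2` plays Redis, `db` plays the database,
// and `bus` plays the invalidation channel every node subscribes to.
class SharedInfra {
    final Map<String, String> l2 = new ConcurrentHashMap<>();
    final Map<String, String> db = new ConcurrentHashMap<>();
    final List<Consumer<String>> bus = new CopyOnWriteArrayList<>();

    void publishInvalidation(String key) {
        bus.forEach(subscriber -> subscriber.accept(key));
    }
}

class CacheNode {
    private final Map<String, String> l1 = new ConcurrentHashMap<>();
    private final SharedInfra infra;
    // Per-node counters from which a hit rate can be derived.
    final AtomicLong l1Hits = new AtomicLong();
    final AtomicLong l2Hits = new AtomicLong();
    final AtomicLong misses = new AtomicLong();

    CacheNode(SharedInfra infra) {
        this.infra = infra;
        // Drop the local copy whenever any node publishes an invalidation.
        infra.bus.add(key -> l1.remove(key));
    }

    // Read: L1, then L2, then DB on a miss; populate L2 and then L1 on the way back.
    String get(String key) {
        String value = l1.get(key);
        if (value != null) {
            l1Hits.incrementAndGet();
            return value;
        }
        value = infra.l2.get(key);
        if (value != null) {
            l2Hits.incrementAndGet();
        } else {
            misses.incrementAndGet();
            value = infra.db.get(key);
            if (value == null) return null; // unknown key: nothing is cached in this sketch
            infra.l2.put(key, value);
        }
        l1.put(key, value);
        return value;
    }

    // Write: update the DB first, then evict L2 and tell every node to drop its L1 entry.
    void put(String key, String value) {
        infra.db.put(key, value);
        infra.l2.remove(key);
        infra.publishInvalidation(key);
    }
}
```

With two nodes sharing one `SharedInfra`, a `put` on the first node makes the next `get` on the second node bypass its old L1 entry and reload the new value.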
Syntax reference
Common commands
- A version in the key avoids stale reads during a rollout.
- A negative cache with a short TTL prevents cache penetration.
- Document the staleness SLO for each cache.
```
# Cache key convention
{service}:{entity}:{id}:v{version}

# Invalidation pub
PUBLISH cache:invalidate catalog:product:42
```
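A small value type can keep the key convention above consistent across call sites. The `CacheKey` record and its method names are hypothetical; the sketch assumes key parts never contain a colon and that the invalidation payload omits the version, as in the example message.

```java
// Hypothetical helper for the {service}:{entity}:{id}:v{version} convention.
record CacheKey(String service, String entity, String id, int version) {

    CacheKey {
        for (String part : new String[] {service, entity, id}) {
            if (part == null || part.isEmpty() || part.contains(":")) {
                throw new IllegalArgumentException("key parts must be non-empty and contain no ':'");
            }
        }
    }

    // Full versioned key used for reads and writes.
    String render() {
        return service + ":" + entity + ":" + id + ":v" + version;
    }

    // Payload for the invalidation channel (no version suffix).
    String invalidationPayload() {
        return service + ":" + entity + ":" + id;
    }

    // Parses a rendered key back into its parts.
    static CacheKey parse(String key) {
        String[] parts = key.split(":");
        if (parts.length != 4 || !parts[3].startsWith("v")) {
            throw new IllegalArgumentException("expected service:entity:id:vN but got " + key);
        }
        return new CacheKey(parts[0], parts[1], parts[2], Integer.parseInt(parts[3].substring(1)));
    }
}
```

Bumping `version` on a schema change means new code never reads entries written in the old format; the old entries simply age out through their TTL.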
Informative example
Two-tier cache service — Caffeine L1 + Redis L2:
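```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.time.Duration;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class TwoTierProductCache {

    // L1: bounded in-process cache with a short TTL to limit staleness.
    private final Cache<String, Product> local = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(Duration.ofMinutes(1))
            .build();

    private final StringRedisTemplate redis;
    // Assumed type: the original uses `db` without showing its declaration.
    private final ProductRepository db;

    public TwoTierProductCache(StringRedisTemplate redis, ProductRepository db) {
        this.redis = redis;
        this.db = db;
    }

    public Product get(String sku) {
        // Caffeine computes the value at most once per key for concurrent callers.
        return local.get(sku, k -> {
            String json = redis.opsForValue().get("product:" + k);
            if (json != null) return parse(json);   // L2 hit
            Product p = db.load(k);                 // L2 miss: load from the database
            if (p == null) return null;             // not found: nothing is cached
            redis.opsForValue().set("product:" + k, serialize(p), Duration.ofMinutes(30));
            return p;
        });
    }

    // parse(...) and serialize(...) are JSON helpers not shown in the original.
}
```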
To keep L1 consistent, subscribe each node to the invalidation channel and call local.invalidate(sku) when an update arrives. Targets Java 21 and Spring Boot 3.
Real-world use
Real-world use cases
- Product catalog L1+L2 (Amazon).
- Feed slice cache with Kafka invalidation (LinkedIn).
- Config service read-through cache.
- Geo lookup cache with regional TTL.
- Permission bitmap cache per request.
Best practices
- ADR per cache: TTL, invalidation, owner.
- Jitter TTL on hot keys.
- Monitor hit rate and staleness incidents.
- Negative cache for scanner protection.
- Version keys on schema change deploy.
- Load test cache miss scenarios.
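The negative-cache practice listed above can be sketched as a loader that remembers "not found" results for a short time. This is an illustration only: `NegativeCachingLoader`, its constructor parameters, and the injected clock are assumptions, and the backing map is unbounded here, whereas a real cache would need a size limit.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical negative cache: stores both found and "not found" results,
// with a shorter TTL for the latter so newly created keys become visible quickly.
class NegativeCachingLoader<V> {
    private record Entry<T>(Optional<T> value, long expiresAtMillis) {}

    private final Map<String, Entry<V>> cache = new ConcurrentHashMap<>();
    private final Function<String, Optional<V>> backend;
    private final long hitTtlMillis;
    private final long missTtlMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised without sleeping

    NegativeCachingLoader(Function<String, Optional<V>> backend,
                          long hitTtlMillis, long missTtlMillis, LongSupplier clock) {
        this.backend = backend;
        this.hitTtlMillis = hitTtlMillis;
        this.missTtlMillis = missTtlMillis;
        this.clock = clock;
    }

    Optional<V> get(String key) {
        long now = clock.getAsLong();
        Entry<V> cached = cache.get(key);
        if (cached != null && cached.expiresAtMillis() > now) {
            return cached.value(); // may be an empty Optional: a cached miss
        }
        Optional<V> loaded = backend.apply(key);
        long ttl = loaded.isPresent() ? hitTtlMillis : missTtlMillis;
        cache.put(key, new Entry<>(loaded, now + ttl));
        return loaded;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`. A scanner that probes many distinct unknown keys can still fill the map, which is why pairing this with a bounded store matters.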
Common mistakes
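- TTL-only invalidation for price or inventory: checkout sees stale data.
- No L1 invalidation when L2 is evicted: nodes keep serving the old value until their local TTL expires.
- Local caching without size limits: out-of-memory errors in the application process.
- The same cache key written with different serializers across services: one service cannot read what another wrote.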
Advanced interview questions
Q1 (Beginner): Cache-aside steps?
Q2 (Beginner): L1 vs L2 cache?
Q3 (Intermediate): Cache stampede fix?
Q4 (Intermediate): Cache penetration?
Q5 (Advanced): Design catalog cache for flash sale?
Summary
L1 Caffeine + L2 Redis is a common pattern.