Sharding
Introduction
Sharding splits data across multiple Redis instances when a single server can no longer hold the dataset or sustain the load. The main options are Redis Cluster (native hash slots), client-side consistent hashing (legacy Jedis/Lettuce setups), and proxies (Twemproxy, Envoy). Each option trades operational complexity against flexibility.
Amazon-scale catalogs shard by ASIN prefix or by slot. A poorly chosen shard key creates hot shards, which are worse than individual hot keys. Resharding requires a migration plan: Redis Cluster moves slots between nodes natively, while client-side sharding needs dual writes or downtime.
Prefer Redis Cluster for new systems unless proxy features mandate otherwise.
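The dual-write approach mentioned for client-side resharding can be sketched roughly as follows. This is a minimal illustration, not a production migration tool: plain dictionaries stand in for Redis shards, and `shard_for`, `DualWriteRouter`, and `backfill` are hypothetical names introduced here.

```python
import hashlib


def shard_for(key, num_shards):
    """Hypothetical helper: stable hash of the key, modulo the shard count."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class DualWriteRouter:
    """Writes go to both layouts; reads prefer the new layout, then the old."""

    def __init__(self, old_shards, new_shards):
        self.old = old_shards  # list of dicts standing in for the old fleet
        self.new = new_shards  # list of dicts standing in for the new fleet

    def set(self, key, value):
        self.old[shard_for(key, len(self.old))][key] = value
        self.new[shard_for(key, len(self.new))][key] = value

    def get(self, key):
        new_store = self.new[shard_for(key, len(self.new))]
        if key in new_store:
            return new_store[key]
        # Not migrated yet: fall back to the old layout.
        return self.old[shard_for(key, len(self.old))].get(key)

    def backfill(self):
        """Copy keys that only exist in the old layout, keeping newer writes."""
        for store in self.old:
            for key, value in store.items():
                target = self.new[shard_for(key, len(self.new))]
                target.setdefault(key, value)
```

Once the backfill completes and reads no longer fall back, the old layout can be retired. Redis Cluster avoids this application-level work because slot migration is built in.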
Understanding the topic
Key concepts
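- Sharding is horizontal partitioning by key hash.
- Client-side: hash(key) % N picks one of N nodes.
- Cluster: CRC16 maps each key to one of 16384 native hash slots.
- Proxy: a transparent routing layer between clients and nodes.
- Resharding: add nodes, then rebalance slots across them.
- Hot shard: uneven key distribution overloads one node.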
flowchart LR
    App -->|hash tag| Slot
    Slot --> Node1
    Slot --> Node2
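The Cluster slot mapping from the key concepts can be sketched in a few lines of Python. Redis Cluster uses the XMODEM variant of CRC16 and hashes only the text inside a non-empty `{hash tag}` when one is present. The function names `crc16_xmodem` and `key_slot` are chosen here for illustration; in practice `CLUSTER KEYSLOT` or a cluster-aware client does this for you.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of 16384 slots, honoring a non-empty {hash tag}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag counts; "foo{}bar" hashes the whole key.
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) & 16383


# Keys sharing a hash tag land in the same slot, so multi-key
# commands on them stay on a single node.
same_slot = key_slot("{user:42}:cart") == key_slot("{user:42}:profile")
```

Here `same_slot` is true because both keys hash only the tag `user:42`.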
Step-by-step explanation
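- The application or proxy computes the target shard.
- Each shard is an independent Redis instance that executes commands on a single thread.
- Cross-shard operations need aggregation in the application.
- In Cluster, gossip between nodes keeps the routing table up to date.
- A failure is isolated to one shard, and replicas take over for that shard.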
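The routing and aggregation steps above can be illustrated with a small scatter-gather sketch. Dictionaries stand in for the Redis shards, and `shard_for` and `multi_get` are hypothetical helpers, not part of any Redis client; a real client would issue one MGET or pipeline per shard instead of dictionary lookups.

```python
import hashlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Hypothetical routing function: stable hash modulo shard count."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def multi_get(shards, keys):
    """Group keys by shard, query each shard once, merge in the application."""
    by_shard = defaultdict(list)
    for key in keys:
        by_shard[shard_for(key, len(shards))].append(key)

    merged = {}
    for index, shard_keys in by_shard.items():
        store = shards[index]  # one round trip per shard in a real system
        for key in shard_keys:
            merged[key] = store.get(key)
    return merged
```

The merge happens in the application because no single shard sees all the keys.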
Syntax reference
Common commands
- Avoid manual client-side % N hashing when Cluster is available.
- Reshard with redis-cli --cluster reshard host:6379.
- Monitor memory balance per node.
# Client-side mental model
slot = crc16(key) % num_shards
# Cluster native
slot = CRC16(key) & 16383
# Inspect
redis-cli --cluster check host:6379
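One reason to avoid manual `% N` hashing is that changing N remaps most keys. The sketch below estimates how many keys move when a fourth shard is added to three. It uses MD5 as a stand-in hash rather than Redis's CRC16, and `stable_hash` and `moved_fraction` are illustrative names.

```python
import hashlib


def stable_hash(key):
    """Stand-in hash (MD5-based); any uniform hash shows the same effect."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys
        if stable_hash(key) % old_n != stable_hash(key) % new_n
    )
    return moved / len(keys)


keys = [f"key:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 3, 4)  # roughly three quarters of keys move
```

With a fixed slot count, as in Cluster, adding a node only moves the slots assigned to it, so far fewer keys relocate.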
Informative example
Check cluster slot balance across nodes:
redis-cli --cluster info host1:6379
redis-cli --cluster check host1:6379
# With 3 masters, expect about 5461 slots each (16384 / 3) and even memory
Uneven slot counts or big keys concentrated on one shard put that node under disproportionate memory and load pressure. Rebalance before any single node reaches 90% memory.
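A balance check like this can be automated once the per-node numbers are collected. The sketch below assumes you have already gathered slot counts and memory figures into a dictionary (for example from `CLUSTER NODES` and `INFO memory`). The input shape, the 10% slot tolerance, and the name `find_imbalance` are assumptions made for illustration; the 90% memory limit follows the guidance above.

```python
TOTAL_SLOTS = 16384


def find_imbalance(nodes, slot_tolerance=0.10, memory_limit=0.90):
    """Return warnings for masters with skewed slot counts or high memory.

    nodes: {name: {"slots": int, "used": int, "max": int}} (assumed shape)
    """
    expected = TOTAL_SLOTS / len(nodes)
    warnings = []
    for name, node in sorted(nodes.items()):
        if abs(node["slots"] - expected) > slot_tolerance * expected:
            warnings.append(
                f"{name}: {node['slots']} slots, expected about {expected:.0f}"
            )
        ratio = node["used"] / node["max"]
        if ratio >= memory_limit:
            warnings.append(f"{name}: memory at {ratio:.0%}")
    return warnings
```

Even slot counts do not guarantee even memory, because a few big keys can skew one node, which is why the sketch checks both.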
Real-world use
Real-world use cases
- Multi-TB session or cache datasets.
- Write load of millions of QPS spread across nodes.
- Tenant isolation by shard key.
- Gradual migration off a single node.
- Legacy Twemproxy fronting a Redis fleet.
Best practices
- Choose a shard key with even distribution.
- Prefer native Cluster over DIY hashing for greenfield systems.
- Monitor per-shard memory and ops/sec.
- Rebalance proactively, not reactively.
- Document hash tag conventions.
- Write the resharding runbook before launch.
Common mistakes
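- Using a sequential userId as the shard key, so the latest users all land on one hot shard.
- Expecting multi-key transactions to work across shards; they are impossible unless all keys live on the same shard.
- Forgetting to give each shard a replica in Cluster.
- Resharding during peak traffic without a capacity buffer.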
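The sequential userId mistake is easy to see in a simulation. The sketch below compares range-based placement of sequential IDs with hashed placement for the most recently created users, who tend to be the most active. The shard count, ID ranges, and helper names are hypothetical, and MD5 is a stand-in hash.

```python
import hashlib
from collections import Counter


def range_shard(user_id, num_shards, max_id):
    """Contiguous ID ranges per shard: the newest IDs share the last shard."""
    return min(user_id * num_shards // max_id, num_shards - 1)


def hash_shard(user_id, num_shards):
    """Hashed placement spreads neighboring IDs across shards."""
    digest = hashlib.md5(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def load_by_shard(shard_fn, active_ids):
    """Count how many active users each shard serves."""
    return Counter(shard_fn(uid) for uid in active_ids)


MAX_ID = 1_000_000
latest_users = range(990_000, 1_000_000)  # assumed: newest 1% are the active ones
range_load = load_by_shard(lambda u: range_shard(u, 4, MAX_ID), latest_users)
hash_load = load_by_shard(lambda u: hash_shard(u, 4), latest_users)
```

In this setup `range_load` puts every active user on the last shard, while `hash_load` spreads them roughly evenly across all four.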
Advanced interview questions
Q1 (Beginner): Why shard Redis?
Q2 (Beginner): Cluster vs client sharding?
Q3 (Intermediate): Twemproxy role?
Q4 (Intermediate): Reshard without downtime?
Q5 (Advanced): Pick shard key for social graph?
Summary
Sharding splits data across Redis nodes.