Redis Tutorial · Lesson 29 · “In-memory data structures to distributed systems”

Sharding

Introduction

Sharding splits data across multiple Redis instances when a single server can no longer hold the dataset or absorb the load. The main options are Redis Cluster (native hash slots), client-side consistent hashing (as in legacy Jedis/Lettuce setups), and proxies (Twemproxy, Envoy). Each trades operational complexity against flexibility.

Amazon-scale catalogs shard by ASIN prefix or by slot. A poor shard key creates hot shards, which are worse than hot keys because every key on the overloaded node suffers. Resharding requires a migration plan: Redis Cluster moves slots between nodes natively, while client-side sharding needs dual writes or downtime.

Prefer Redis Cluster for new systems unless proxy features mandate otherwise.

Understanding the topic

Key concepts

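Sharding is horizontal partitioning by key hash.
Client-side: the application picks a node, in the simplest form hash(key) % N.
Cluster: native hash slots, CRC16(key) mod 16384.
Proxy: a transparent routing layer between clients and Redis.
Resharding: add nodes, then rebalance slots across them.
Hot shard: one node overloaded because keys or traffic are unevenly distributed.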

text

flowchart LR
  App -->|hash tag| Slot
  Slot --> Node1
  Slot --> Node2
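
The key-to-slot mapping in the diagram can be sketched in a few lines of Python. This is a minimal illustration, assuming the CRC16 variant that Redis Cluster specifies (XMODEM, which Python's `binascii.crc_hqx` computes when given an initial value of 0) and the hash-tag rule that only the text between the first `{` and the next `}` is hashed when that text is non-empty. The function name `key_hash_slot` and the sample keys are made up for the example.

```python
import binascii

SLOT_COUNT = 16384  # Redis Cluster uses a fixed number of hash slots


def key_hash_slot(key: str) -> int:
    """Return the hash slot for a key, honoring {hash tags}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag narrows what gets hashed
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    # crc_hqx with initial value 0 is CRC16/XMODEM
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


if __name__ == "__main__":
    print(key_hash_slot("foo"))  # 12182
    # Keys sharing a hash tag land in the same slot
    print(key_hash_slot("{user:1000}:profile") == key_hash_slot("{user:1000}:cart"))  # True
```

Because both sample keys carry the tag `{user:1000}`, they map to one slot and can take part in the same multi-key command.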

Step-by-step explanation

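The application or proxy computes the target shard for each key.
Each shard is an independent Redis instance that executes commands on a single thread.
Cross-shard operations must be aggregated by the application.
In Cluster, gossip between nodes keeps the slot-to-node routing table up to date.
A failure is isolated to one shard, where a replica can take over.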
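
The first and third steps above can be sketched with in-memory dictionaries standing in for Redis instances. This is an illustrative toy, not a client library: `ShardedStore`, its methods, and the choice of `zlib.crc32` as the hash are all assumptions made for the example, and a real deployment would send the per-shard requests over the network.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Optional


class ShardedStore:
    """Toy client-side sharding: each dict stands in for one Redis instance."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[str, str]] = [dict() for _ in range(num_shards)]

    def shard_for(self, key: str) -> int:
        # Stable hash so every client maps a key to the same shard
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def set(self, key: str, value: str) -> None:
        self.shards[self.shard_for(key)][key] = value

    def mget(self, keys: List[str]) -> List[Optional[str]]:
        # Group keys by shard so each shard is asked only once
        by_shard: Dict[int, List[str]] = defaultdict(list)
        for key in keys:
            by_shard[self.shard_for(key)].append(key)
        # Fetch per shard, then merge back into the caller's key order
        found: Dict[str, Optional[str]] = {}
        for shard_id, shard_keys in by_shard.items():
            shard = self.shards[shard_id]
            for key in shard_keys:
                found[key] = shard.get(key)
        return [found[key] for key in keys]


if __name__ == "__main__":
    store = ShardedStore(num_shards=3)
    for i in range(6):
        store.set(f"user:{i}", f"name-{i}")
    print(store.mget(["user:0", "user:5", "user:missing"]))
    # ['name-0', 'name-5', None]
```

The merge step is what a single-node `MGET` would otherwise do for free; once data is sharded, that work moves into the application or proxy.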

Syntax reference

Common commands

Avoid manual client-side hash % N sharding when Cluster is available.
Reshard with redis-cli --cluster reshard.
Monitor memory balance per node.

bash

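# Client-side mental model (pseudocode, not a shell command):
#   shard = hash(key) % num_shards
# Cluster native (pseudocode):
#   slot = CRC16(key) & 16383   # equivalent to CRC16(key) mod 16384
# Ask a node for a key's slot (user:1000 is an example key)
redis-cli -h host -p 6379 CLUSTER KEYSLOT user:1000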

# Inspect
redis-cli --cluster check host:6379
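
One reason to avoid plain `hash % N` is how many keys change owner when N changes. The sketch below compares it with a consistent hash ring when a fifth node is added. It is a rough illustration under assumptions: the `HashRing` class, the node names, the use of MD5 as a stable hash, and 100 virtual nodes per server are all choices made for this example, not something Redis prescribes.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    # MD5 is used only as a stable, well-mixed hash, not for security
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def modulo_shard(key: str, nodes: list) -> str:
    return nodes[stable_hash(key) % len(nodes)]


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative)."""

    def __init__(self, nodes: list, vnodes: int = 100) -> None:
        points = []
        for node in nodes:
            for i in range(vnodes):
                points.append((stable_hash(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._owners = [n for _, n in points]

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._owners[idx]


def moved_fraction(before, after, keys) -> float:
    """Share of keys whose owner differs between two routing functions."""
    moved = sum(1 for k in keys if before(k) != after(k))
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"key:{i}" for i in range(10_000)]
    old_nodes = ["n1", "n2", "n3", "n4"]
    new_nodes = old_nodes + ["n5"]

    mod = moved_fraction(lambda k: modulo_shard(k, old_nodes),
                         lambda k: modulo_shard(k, new_nodes), keys)
    old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
    ring = moved_fraction(old_ring.node_for, new_ring.node_for, keys)
    print(f"modulo moved {mod:.0%}, ring moved {ring:.0%}")
```

With a uniform hash, going from 4 to 5 nodes under modulo should remap about four keys in five, while the ring should move roughly the one-fifth that the new node takes over. Redis Cluster sidesteps the problem differently, by keeping the slot count fixed and moving whole slots between nodes.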

Informative example

Check cluster slot balance across nodes:

bash

redis-cli --cluster info host1:6379
redis-cli --cluster check host1:6379
# With three masters, expect about 5461 slots each (16384 / 3) and similar memory use

Uneven slot counts or a few very large keys on one shard put memory pressure on that node while the others sit underused. Rebalance before any single node reaches 90% memory.
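
The "even slots" check can be turned into a small calculation. The sketch below is a hypothetical helper, not part of redis-cli: `slot_imbalance`, the 10% tolerance, and the sample counts are assumptions for illustration, with the counts imagined as read off the `--cluster info` output by hand.

```python
SLOT_COUNT = 16384


def slot_imbalance(slots_per_master: dict, tolerance: float = 0.10) -> dict:
    """Return masters whose slot count deviates from an even split by more than tolerance."""
    ideal = SLOT_COUNT / len(slots_per_master)
    flagged = {}
    for node, count in slots_per_master.items():
        deviation = (count - ideal) / ideal
        if abs(deviation) > tolerance:
            flagged[node] = round(deviation, 3)
    return flagged


if __name__ == "__main__":
    # Hypothetical slot counts for a three-master cluster
    observed = {"host1:6379": 5461, "host2:6379": 7000, "host3:6379": 3923}
    print(slot_imbalance(observed))
```

Slot counts are only a proxy: a node with an even share of slots can still run hot if it holds a few very large or very busy keys, so memory and ops/sec per node need checking as well.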

Real-world use

Real-world use cases

Multi-terabyte session or cache datasets.
Write loads of millions of operations per second spread across nodes.
Tenant isolation by shard key.
Gradual migration off a single node.
Legacy Twemproxy fronting a Redis fleet.

Best practices

Choose a shard key with even distribution.
Prefer native Cluster over DIY hashing for greenfield projects.
Monitor per-shard memory and ops/sec.
Rebalance proactively, not reactively.
Document hash tag conventions.
Write the resharding runbook before launch.

Common mistakes

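Sharding by ranges of a sequential userId, which concentrates the newest and most active users on one shard.
Expecting cross-shard multi-key transactions to work; the keys must share a slot, for example through a common hash tag.
Forgetting to give each shard a replica in Cluster.
Resharding during peak traffic without a capacity buffer.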

Advanced interview questions

Q1 (Beginner): Why shard Redis?

Because the dataset exceeds a single node's RAM or the write load exceeds its QPS ceiling.

Q2 (Beginner): Cluster vs client sharding?

Cluster manages slots and failover natively; client sharding relies on a manually maintained node list.

Q3 (Intermediate): Twemproxy role?

A proxy layer that routes requests to backend Redis instances; it predates wide Cluster adoption.

Q4 (Intermediate): Reshard without downtime?

In Cluster, migrate slots online and let clients follow ASK redirections during the move; for client-side sharding, use a dual-write migration.

Q5 (Advanced): Pick shard key for social graph?

Hash on userId rather than on the celebrity's id; isolate hot users; optionally add a local L1 cache for celebrities.
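
For Q4, it can help to show how a client might treat the two redirection replies Redis Cluster sends. The sketch below is a simplified model, assuming the reply text has the form `MOVED <slot> <host:port>` or `ASK <slot> <host:port>`; the `Redirect` type, both function names, and the plain-dict slot map are hypothetical and do not mirror any particular client library.

```python
from typing import Dict, NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str      # "MOVED" or "ASK"
    slot: int
    address: str   # "host:port" of the node to contact


def parse_redirect(error_message: str) -> Optional[Redirect]:
    """Parse 'MOVED <slot> <host:port>' or 'ASK <slot> <host:port>'."""
    parts = error_message.split()
    if len(parts) == 3 and parts[0] in ("MOVED", "ASK") and parts[1].isdigit():
        return Redirect(parts[0], int(parts[1]), parts[2])
    return None


def apply_redirect(slot_map: Dict[int, str], redirect: Redirect) -> str:
    """Return the node to retry against, updating the map only for MOVED."""
    if redirect.kind == "MOVED":
        # Ownership changed permanently: remember the new owner
        slot_map[redirect.slot] = redirect.address
    # For ASK the slot is mid-migration: retry once at the target
    # (after sending ASKING) without touching the cached map
    return redirect.address


if __name__ == "__main__":
    slot_map = {3999: "127.0.0.1:6379"}
    ask = parse_redirect("ASK 3999 127.0.0.1:6381")
    print(apply_redirect(slot_map, ask), slot_map[3999])
    # 127.0.0.1:6381 127.0.0.1:6379
```

Leaving the cached map untouched on ASK matters because, until the migration finishes, the source node still owns the slot and serves the keys that have not moved yet.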

Summary

Sharding splits data across Redis nodes.
