Redis Tutorial, Lesson 29

    Sharding

    Introduction

    Sharding splits data across multiple Redis instances when a single server can no longer hold the dataset or sustain the load. The main options are Redis Cluster (native hash slots), client-side consistent hashing (as in legacy Jedis/Lettuce setups), and proxies (Twemproxy, Envoy). Each option trades operational complexity against flexibility.

    Amazon-scale catalogs shard by ASIN prefix or by slot. A poorly chosen shard key creates hot shards, which can hurt more than individual hot keys. Resharding needs a migration plan: Cluster moves slots between nodes online, while client-side sharding needs a dual-write phase or downtime.
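
    To make the dual-write idea for client-side sharding concrete, here is a minimal sketch. It assumes plain dicts standing in for Redis connections and a CRC16-modulo shard function; `DualWriteRouter`, `shard_index`, and the key names are hypothetical and only illustrate the pattern, not any particular client library.

```python
import binascii


def shard_index(key: str, num_shards: int) -> int:
    """Pick a shard with CRC16 modulo the shard count (illustrative choice)."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % num_shards


class DualWriteRouter:
    """Hypothetical router used while moving from old_shards to new_shards.

    Plain dicts stand in for Redis connections.
    """

    def __init__(self, old_shards, new_shards):
        self.old = old_shards
        self.new = new_shards

    def set(self, key, value):
        # During migration every write goes to both layouts.
        self.old[shard_index(key, len(self.old))][key] = value
        self.new[shard_index(key, len(self.new))][key] = value

    def get(self, key):
        # Prefer the new layout; fall back to the old one for keys
        # that have not been backfilled yet.
        value = self.new[shard_index(key, len(self.new))].get(key)
        if value is None:
            value = self.old[shard_index(key, len(self.old))].get(key)
        return value

    def backfill(self):
        # Copy existing keys without overwriting fresher dual-written values.
        for shard in self.old:
            for key, value in shard.items():
                target = self.new[shard_index(key, len(self.new))]
                target.setdefault(key, value)


old = [{} for _ in range(2)]
new = [{} for _ in range(3)]
old[shard_index("cart:1", 2)]["cart:1"] = "apple"  # data written before migration

router = DualWriteRouter(old, new)
router.set("cart:2", "pear")   # lands in both layouts
before = router.get("cart:1")  # served from the old layout
router.backfill()
after = new[shard_index("cart:1", 3)]["cart:1"]
```

    Once the backfill has finished and reads no longer fall back, the old layout can be retired. A real migration would also need to handle deletes and expirations, which this sketch leaves out.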

    Prefer Redis Cluster for new systems unless proxy features mandate otherwise.

    Understanding the topic

    Key concepts

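    • Sharding is horizontal partitioning by key hash.
    • Client-side: the client computes hash(key) % N to pick one of N nodes.
    • Cluster: keys map to 16384 native hash slots via CRC16.
    • Proxy: a routing layer that is transparent to the application.
    • Resharding: adding nodes and rebalancing slots across them.
    • Hot shard: one shard overloaded by uneven key or traffic distribution.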
    text
    flowchart LR
    App -->|hash tag| Slot
    Slot --> Node1
    Slot --> Node2
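
    The key-to-slot mapping in the diagram can be sketched in a few lines. This assumes Python's `binascii.crc_hqx` with an initial value of 0, which computes the same CRC16 (XMODEM) variant that Redis Cluster uses; `key_slot` is a hypothetical helper name. If a key contains a non-empty `{...}` section, only that hash tag is hashed, so related keys can be forced onto one slot.

```python
import binascii

NUM_SLOTS = 16384


def key_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key, honoring hash tags."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


print(key_slot("foo"))                # 12182
print(key_slot("user:{42}:profile"))  # same slot as the line below
print(key_slot("user:{42}:orders"))
```

    The result can be cross-checked against a live node with `CLUSTER KEYSLOT`.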

    Step-by-step explanation

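    1. The application, client library, or proxy computes the target shard for each key.
    2. Each shard is an independent Redis instance that executes commands on a single thread.
    3. Cross-shard operations need aggregation in the application.
    4. In Cluster, nodes gossip slot ownership, and clients refresh their routing table from it (for example after a MOVED redirect).
    5. With a replica per shard, a failure stays isolated to that shard.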
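
    Steps 1 and 3 can be illustrated with a small sketch of application-side aggregation. It assumes dicts standing in for shard connections; `shard_for`, `sharded_set`, and `sharded_mget` are hypothetical names. With a real client, the inner loop would typically become one MGET per shard.

```python
import binascii
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Step 1: compute the target shard for a key."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % num_shards


def sharded_set(shards, key, value):
    shards[shard_for(key, len(shards))][key] = value


def sharded_mget(shards, keys):
    """Step 3: group keys per shard, read each group, merge in request order."""
    groups = defaultdict(list)
    for key in keys:
        groups[shard_for(key, len(shards))].append(key)

    merged = {}
    for index, group in groups.items():
        shard = shards[index]
        for key in group:  # a real client would issue one MGET per shard here
            merged[key] = shard.get(key)
    return [merged[key] for key in keys]


shards = [{} for _ in range(3)]
sharded_set(shards, "a", 1)
sharded_set(shards, "b", 2)
sharded_set(shards, "c", 3)
values = sharded_mget(shards, ["a", "missing", "c"])
```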

    Syntax reference

    Common commands

    • Avoid manual client-side % N sharding when Cluster is available.
    • Reshard with redis-cli --cluster reshard host:port.
    • Monitor memory balance per node.
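    bash
    # Client-side mental model (pseudocode): shard = crc16(key) % num_shards
    # Cluster native (pseudocode): slot = CRC16(key) % 16384, i.e. CRC16(key) & 16383
    # Ask a cluster node which slot a key maps to
    redis-cli -h host -p 6379 cluster keyslot user:1000
    # Inspect slot coverage and node health
    redis-cli --cluster check host:6379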
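
    One reason to avoid manual `% N` sharding is how many keys change shard when N changes. The sketch below measures that for an illustrative key set; `shard_for`, `moved_fraction`, and the `user:<n>` key pattern are assumptions made for the example.

```python
import binascii


def shard_for(key: str, num_shards: int) -> int:
    return binascii.crc_hqx(key.encode("utf-8"), 0) % num_shards


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose shard changes when the node count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 3, 4)
print(f"{fraction:.0%} of keys change shard when going from 3 to 4 nodes")
```

    With a reasonably uniform hash, roughly three quarters of keys are expected to move in this case. An evenly rebalanced Cluster going from 3 to 4 masters only needs to move about a quarter of the slots.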

    Informative example

    Check cluster slot balance across nodes:

    bash
    redis-cli --cluster info host1:6379
    redis-cli --cluster check host1:6379
    # With 3 masters, expect about 5461 slots each (16384 / 3) and similar memory use

    Uneven slot counts or large keys concentrated on one shard lead to memory pressure and latency on that node. Rebalance before any single node reaches 90% memory.
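
    The arithmetic behind this check is simple enough to script. The sketch below assumes you have already collected per-node memory figures (for example from INFO memory); the node names, byte values, and helper names are made up for illustration.

```python
TOTAL_SLOTS = 16384


def even_slot_counts(num_masters: int):
    """Slot counts per master when 16384 slots are split as evenly as possible."""
    base, extra = divmod(TOTAL_SLOTS, num_masters)
    return [base + 1 if i < extra else base for i in range(num_masters)]


def nodes_over_threshold(memory, threshold=0.9):
    """memory maps node name -> (used_bytes, max_bytes)."""
    return sorted(
        name for name, (used, limit) in memory.items() if used / limit >= threshold
    )


# Hypothetical readings for a 3-master cluster with 8 GB per node.
sample = {
    "node-a": (7_400_000_000, 8_000_000_000),
    "node-b": (4_100_000_000, 8_000_000_000),
    "node-c": (4_000_000_000, 8_000_000_000),
}

print(even_slot_counts(3))           # [5462, 5461, 5461]
print(nodes_over_threshold(sample))  # ['node-a']
```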

    Real-world use

    Real-world use cases

    • Multi-terabyte session or cache datasets.
    • Write load of millions of QPS spread across nodes.
    • Tenant isolation by shard key.
    • Gradual migration off a single node.
    • Legacy Twemproxy fronting a Redis fleet.

    Best practices

    • Choose a shard key with even distribution.
    • Prefer native Cluster over DIY hashing for greenfield systems.
    • Monitor memory and ops/sec per shard.
    • Rebalance proactively, not reactively.
    • Document hash tag conventions.
    • Write the resharding runbook before launch.

    Common mistakes

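    • Using a sequential userId as a range shard key, which puts the newest, most active users on one hot shard.
    • Expecting multi-key transactions to work across shards; they are impossible when the keys live on different shards.
    • Forgetting to give each shard a replica in Cluster.
    • Resharding during peak traffic without a capacity buffer.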
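
    The first mistake is easy to reproduce in a toy simulation. The sketch assumes 10,000 sequential user IDs, 4 shards, and traffic that only touches the newest 1,000 users; all names and numbers are illustrative.

```python
import binascii
from collections import Counter

NUM_SHARDS = 4
TOTAL_USERS = 10_000


def range_shard(user_id: int) -> int:
    """Range partitioning: consecutive IDs share a shard."""
    return user_id * NUM_SHARDS // TOTAL_USERS


def hash_shard(user_id: int) -> int:
    """Hash partitioning: CRC16 of the key, modulo the shard count."""
    return binascii.crc_hqx(f"user:{user_id}".encode("utf-8"), 0) % NUM_SHARDS


recent_users = range(9_000, 10_000)  # the newest users generate the traffic
range_load = Counter(range_shard(u) for u in recent_users)
hash_load = Counter(hash_shard(u) for u in recent_users)

print("range:", dict(range_load))  # every request lands on shard 3
print("hash: ", dict(sorted(hash_load.items())))
```

    Under range partitioning all 1,000 requests hit the last shard, while hashing the same IDs spreads them across all four.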

    Advanced interview questions

    Q1 (Beginner): Why shard Redis?
    When the dataset exceeds a single node's RAM or the write load exceeds what one node can sustain.
    Q2 (Beginner): Cluster vs client sharding?
    Cluster manages slots and failover natively; client sharding relies on a manually maintained node list.
    Q3 (Intermediate): Twemproxy role?
    A proxy layer that routes requests to backend Redis nodes; it predates wide Cluster adoption.
    Q4 (Intermediate): Reshard without downtime?
    In Cluster, migrate slots online while clients follow ASK redirects; with client-side sharding, run a dual-write migration.
    Q5 (Advanced): Pick shard key for social graph?
    Hash on userId rather than on the celebrity's id; isolate hot users; optionally add a local L1 cache for celebrities.

    Summary

    Sharding splits data across Redis nodes.
