High-Level Design Tutorial, Lesson 36

    Chat Application

    Design a chat application like WhatsApp or Slack — 1:1 and group messaging, online presence, delivery receipts, and message history.


    Introduction

    Real-time chat design centers on WebSockets for live delivery, message ordering per conversation, fan-out for groups, and mobile push for offline users.

    Assume millions of concurrent connections, billions of stored messages, at-least-once delivery with client-side deduplication, and end-to-end encryption as an optional advanced topic.

    Understanding the topic

    Key concepts

    • WebSocket gateway: scale horizontally, using sticky connections or a shared pub/sub layer so a message reaches the node that holds the recipient's connection.
    • Message store: Cassandra, partitioned by conversation_id and clustered by a time-based UUID.
    • Delivery flow: sent → delivered → read, with receipts carried as separate lightweight events.
    • Presence: client heartbeats refresh a Redis key with a TTL to mark the user online; a last-seen timestamp covers offline users.
    • Group chat: fan-out on write for small groups; fan-out on read for large channels.
    • Push notifications (APNs/FCM) when the recipient is offline.
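
The delivery flow above (sent → delivered → read) implies that receipt state should only move forward, even when at-least-once delivery produces duplicate or out-of-order receipt events. A minimal sketch of that rule follows, assuming one recipient per message; `ReceiptTracker` and its in-memory map are hypothetical stand-ins for a real receipt store, not part of the lesson's design.

```java
import java.util.HashMap;
import java.util.Map;

/** Delivery states in the order the lesson lists them: sent -> delivered -> read. */
enum DeliveryState { SENT, DELIVERED, READ }

/** Hypothetical in-memory tracker that keeps the furthest state seen per message. */
class ReceiptTracker {
    private final Map<String, DeliveryState> stateByMessageId = new HashMap<>();

    /** Applies a receipt event and returns true only if it advanced the state. */
    boolean apply(String messageId, DeliveryState incoming) {
        DeliveryState current = stateByMessageId.get(messageId);
        // Ignore duplicates and late events, e.g. DELIVERED arriving after READ.
        if (current != null && incoming.compareTo(current) <= 0) {
            return false;
        }
        stateByMessageId.put(messageId, incoming);
        return true;
    }

    /** Returns the furthest state recorded, or null if the message is unknown. */
    DeliveryState stateOf(String messageId) {
        return stateByMessageId.get(messageId);
    }
}
```

In this sketch a READ receipt may arrive before DELIVERED and is accepted, on the assumption that a read message was necessarily delivered.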

    Internal architecture

    Architecture overview

    text
    flowchart TB
    Client --> WS[WebSocket Gateway]
    WS --> ChatSvc
    ChatSvc --> Kafka
    Kafka --> Presence
    ChatSvc --> Cassandra

    Step-by-step explanation

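    1. The client opens a WSS connection to the WebSocket Gateway cluster, which forwards traffic to the Chat Service.
    2. On send, the Chat Service persists the message to Cassandra, then publishes it to a Kafka topic keyed by conversationId.
    3. Online recipients: the gateway holding the recipient's connection receives the message, through a Redis pub/sub user channel or a Kafka consumer, and pushes a WebSocket frame.
    4. Offline recipients: the Notification service sends an FCM or APNs push, subject to the message preview policy.
    5. Media messages: the client uploads the file to S3 through a presigned URL, and the message body stores only the URL reference.
    6. History sync: paginated GET /conversations/{id}/messages?before=cursor.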
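
Step 6 relies on cursor pagination over messages that are sorted by a time-ordered ID within each conversation. The sketch below illustrates the `before=cursor` idea with an in-memory `TreeMap` per conversation; `HistoryStore`, the `long` message IDs, and the method names are assumptions made for illustration, standing in for a Cassandra partition and time UUIDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical in-memory store: one sorted partition of messages per conversation. */
class HistoryStore {
    // conversationId -> (messageId -> body), message IDs kept in ascending order.
    private final Map<String, NavigableMap<Long, String>> partitions = new HashMap<>();

    void append(String conversationId, long messageId, String body) {
        partitions.computeIfAbsent(conversationId, id -> new TreeMap<>()).put(messageId, body);
    }

    /**
     * Returns up to `limit` message IDs strictly older than `before`, newest first.
     * A null cursor means "start from the latest message".
     */
    List<Long> pageBefore(String conversationId, Long before, int limit) {
        NavigableMap<Long, String> partition = partitions.get(conversationId);
        if (partition == null) {
            return List.of();
        }
        NavigableMap<Long, String> older =
                (before == null) ? partition : partition.headMap(before, false);
        List<Long> page = new ArrayList<>();
        for (Long id : older.descendingKeySet()) {
            if (page.size() == limit) {
                break;
            }
            page.add(id);
        }
        return page;
    }
}
```

A client would pass the oldest ID of the page it just received as the next `before` value, so each request reads a bounded slice no matter how long the conversation is.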

    Informative example

    Message send API that persists a message and publishes it to Kafka for the WebSocket delivery workers:

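    java
    // Sketch: in a real project each public class lives in its own file. Message, MessageDto,
    // SendMessageRequest, ChatMessageEvent, and MessageRepository are application types not shown here.
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.security.core.annotation.AuthenticationPrincipal;
    import org.springframework.security.oauth2.jwt.Jwt;
    import org.springframework.stereotype.Service;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/api/v1/conversations/{cid}/messages")
    public class MessageController {
        private final MessageService messages;

        public MessageController(MessageService messages) { this.messages = messages; }

        // The sender ID comes from the authenticated JWT, not from the request body.
        @PostMapping
        public MessageDto send(@PathVariable String cid, @RequestBody SendMessageRequest req,
                               @AuthenticationPrincipal Jwt jwt) {
            return messages.send(cid, jwt.getSubject(), req.body());
        }
    }

    @Service
    public class MessageService {
        private final MessageRepository cassandra;
        private final KafkaTemplate<String, ChatMessageEvent> kafka;

        // Constructor injection; the final fields would not compile without it.
        public MessageService(MessageRepository cassandra, KafkaTemplate<String, ChatMessageEvent> kafka) {
            this.cassandra = cassandra;
            this.kafka = kafka;
        }

        public MessageDto send(String conversationId, String senderId, String body) {
            // Persist first so the message is durable before anyone is notified.
            Message msg = cassandra.save(Message.create(conversationId, senderId, body));
            // The record key is conversationId, so one conversation maps to one partition.
            kafka.send("chat.messages", conversationId,
                    new ChatMessageEvent(msg.id(), conversationId, senderId, body, msg.sentAt()));
            return MessageDto.from(msg);
        }
    }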

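    Keying Kafka records by conversationId routes every message of a conversation to the same partition, which preserves order within that conversation. The WebSocket gateway scales with connection count, so deploy and scale it separately from the REST API.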
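
To see why a shared key gives per-conversation ordering, consider a simplified partitioner: equal keys always hash to the same partition, and a partition is consumed in order. The class below is an illustrative stand-in that uses `String.hashCode()`; Kafka's default partitioner also hashes the record key but with a different hash function, and `KeyPartitioner` is a hypothetical name, not a Kafka API.

```java
/** Illustrative key-based partitioner; a stand-in, not Kafka's actual hashing algorithm. */
class KeyPartitioner {
    private final int partitionCount;

    KeyPartitioner(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    /** Maps a key to a partition in [0, partitionCount); equal keys always map to the same one. */
    int partitionFor(String key) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), partitionCount);
    }
}
```

Different conversations may share a partition, which is fine: ordering is only required within a conversation, not across them. Changing the partition count changes the mapping, so it is usually chosen with headroom up front.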

    Real-world use

    Real-world use cases

    • Enterprise team collaboration in the style of Slack.
    • Secure telehealth messaging with HIPAA audit trails.
    • In-app buyer-seller chat for e-commerce.
    • Low-latency guild chat in games.

    Best practices

    • Make sends idempotent with a client-generated message UUID.
    • Paginate history; never load a full conversation at once.
    • Apply backpressure at the gateway when a client consumes slowly.
    • Use TLS on every hop; E2EE is an optional product decision.
    • Run abuse reports through an asynchronous moderation pipeline.
    • Load test the connection count each gateway pod can hold.
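
The first practice above, idempotent send, pairs with at-least-once delivery: a client that retries after a timeout must not create a second copy of the message. One possible shape is sketched below, assuming the client attaches its own UUID to each send; `IdempotentSender`, `StoredMessage`, and the in-memory map are hypothetical stand-ins for the real message store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stored form of a message: server-assigned ID plus body. */
record StoredMessage(long serverId, String body) {}

/** Hypothetical sender that deduplicates on the client-generated message ID. */
class IdempotentSender {
    private final Map<String, StoredMessage> byClientId = new ConcurrentHashMap<>();
    private final AtomicLong nextServerId = new AtomicLong(1);

    /**
     * Stores the message once per client ID. A retry with the same client ID
     * returns the originally stored message and writes nothing new.
     */
    StoredMessage send(String clientMessageId, String body) {
        return byClientId.computeIfAbsent(
                clientMessageId,
                id -> new StoredMessage(nextServerId.getAndIncrement(), body));
    }

    int storedCount() {
        return byClientId.size();
    }
}
```

In a real store the same effect would typically come from a conditional insert keyed on the client ID; the map here only illustrates the contract that retries are safe.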

    Common mistakes

    • Polling over HTTP instead of using WebSockets, which hurts both battery life and latency.
    • Fan-out on write to a 10k-member channel, which causes heavy write amplification.
    • Providing no ordering guarantee within a conversation.
    • Storing large media in the message row, which bloats the table.
    • Running a single gateway as a SPOF with no plan for horizontal scaling.
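
The write-amplification mistake above can be made concrete by counting writes per message under each strategy. The sketch below picks fan-out on write for groups up to a threshold and fan-out on read beyond it; `FanOutRouter`, the threshold parameter, and the in-memory maps are assumptions for illustration, and a suitable threshold depends on the workload.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical router that chooses a fan-out strategy by group size. */
class FanOutRouter {
    private final int writeFanOutLimit; // assumed threshold, tuned per workload
    private final Map<String, List<String>> inboxByUser = new HashMap<>();
    private final Map<String, List<String>> timelineByChannel = new HashMap<>();

    FanOutRouter(int writeFanOutLimit) {
        this.writeFanOutLimit = writeFanOutLimit;
    }

    /** Publishes a message and returns how many writes it cost. */
    int publish(String channelId, List<String> members, String messageId) {
        if (members.size() <= writeFanOutLimit) {
            // Fan-out on write: one inbox entry per member.
            for (String member : members) {
                inboxByUser.computeIfAbsent(member, k -> new ArrayList<>()).add(messageId);
            }
            return members.size();
        }
        // Fan-out on read: a single write to the channel timeline; members pull it later.
        timelineByChannel.computeIfAbsent(channelId, k -> new ArrayList<>()).add(messageId);
        return 1;
    }

    List<String> inboxOf(String userId) {
        return inboxByUser.getOrDefault(userId, List.of());
    }

    List<String> timelineOf(String channelId) {
        return timelineByChannel.getOrDefault(channelId, List.of());
    }
}
```

With this split, a message to a 10k-member channel costs one write instead of 10,000, at the price of each reader querying the channel timeline.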

    Advanced interview questions

    Q1 (Beginner): WebSocket vs long polling for chat?
    WebSocket is full-duplex and low latency, so it is the primary real-time path; keep long polling only as a fallback.
    Q2 (Beginner): How do you store chat messages?
    In a wide-column store partitioned by conversationId, with a time-ordered messageId for efficient range queries.
    Q3 (Intermediate): Fan-out on write vs fan-out on read for groups?
    Fan-out on write for small groups and fan-out on read for large channels; hybrids exist, such as WhatsApp broadcast lists.
    Q4 (Intermediate): How do you implement online presence?
    Heartbeats refresh a per-user Redis "online" key with a TTL; friends read status with MGET or subscribe to presence changes over pub/sub.
    Q5 (Advanced): Design for WhatsApp scale, 2B users.
    Geo-sharded WS gateways, multi-DC Cassandra, a Kafka message pipeline, E2EE with client-side keys, push via FCM/APNs, media in S3, different fan-out per group-size tier, and a CDN for media only, not for messages.
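
The presence answer in Q4 can be sketched without Redis by modelling the TTL directly: a user counts as online only while the latest heartbeat is younger than the TTL. `PresenceTracker` and the injected clock are hypothetical, chosen so the expiry logic can be exercised deterministically; in Redis the same effect would come from setting the key with an expiry on every heartbeat.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical in-memory presence tracker that mimics a key with a TTL. */
class PresenceTracker {
    private final long ttlMillis;
    private final LongSupplier clock; // injected so tests can control time
    private final Map<String, Long> lastHeartbeatMillis = new HashMap<>();

    PresenceTracker(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Records a heartbeat, which restarts the TTL window for this user. */
    void heartbeat(String userId) {
        lastHeartbeatMillis.put(userId, clock.getAsLong());
    }

    /** A user is online while the last heartbeat is younger than the TTL. */
    boolean isOnline(String userId) {
        Long last = lastHeartbeatMillis.get(userId);
        return last != null && clock.getAsLong() - last < ttlMillis;
    }

    /** Time of the last heartbeat, usable as "last seen"; null if never seen. */
    Long lastSeen(String userId) {
        return lastHeartbeatMillis.get(userId);
    }
}
```

The heartbeat interval should be comfortably shorter than the TTL so that one lost heartbeat does not flip a connected user to offline.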

    Summary

    A chat HLD combines WebSockets, a durable message store, and push notifications. Partition messages by conversation to get ordering and efficient queries. Choose the fan-out strategy by group size. Track presence with Redis keys whose TTL is refreshed by heartbeats. Kafka bridges persistence to the live delivery workers. A food delivery design, by comparison, adds dispatch and geo-matching complexity.
