High-Level Design Tutorial, Lesson 14

    CDN


    Introduction

    A Content Delivery Network (CDN) caches static and cacheable dynamic content at edge PoPs (points of presence) close to users, reducing latency and origin load. CloudFront, Akamai, Fastly, and Cloudflare serve images, JS/CSS, video segments, and even API responses with short TTLs.

    In high-level design (HLD), a CDN is effectively mandatory at scale for read-heavy assets and global user bases. URL shortener redirects, OTT video chunks, and product catalog images all benefit. Interviewers expect you to mention cache invalidation, TTL strategy, and origin shielding.

    This lesson covers CDN placement, caching headers, dynamic content acceleration, and cost considerations.

    Understanding the topic

    Key concepts

    • Edge cache: content stored at PoPs geographically close to users; a cache hit avoids the round trip to the origin.
    • Cache key: URL path + query string rules + headers named in Vary (careful with auth: responses that depend on Authorization or cookies must not share a key across users).
    • TTL: Cache-Control max-age, s-maxage for shared caches, stale-while-revalidate.
    • Invalidation: purge by path or tag after a deploy or content update.
    • Origin shield: a secondary cache layer between edge PoPs and the origin that absorbs simultaneous misses (the thundering herd).
    • Dynamic site acceleration: the CDN maintains persistent connections to the origin to speed up uncacheable API traffic.
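
To make the cache key concept concrete, here is a minimal sketch of key normalization. The class name `CacheKeyBuilder`, the allow-list of query parameters, and the `|` separator are all assumptions made for illustration; real CDNs expose this as configuration rather than code.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper that normalizes a request into an edge cache key. */
final class CacheKeyBuilder {
    private final Set<String> allowedParams; // query params that change the response
    private final List<String> varyHeaders;  // header names listed in the origin's Vary

    CacheKeyBuilder(Set<String> allowedParams, List<String> varyHeaders) {
        this.allowedParams = allowedParams;
        this.varyHeaders = varyHeaders;
    }

    /** Expects the header names in {@code headers} to be lowercase. */
    String build(String path, Map<String, String> query, Map<String, String> headers) {
        // Keep only allow-listed params, sorted, so ?a=1&b=2 and ?b=2&a=1 share an entry
        // and tracking params such as utm_source do not fragment the cache.
        TreeMap<String, String> kept = new TreeMap<>();
        query.forEach((name, value) -> {
            if (allowedParams.contains(name)) kept.put(name, value);
        });

        StringBuilder key = new StringBuilder(path);
        char separator = '?';
        for (Map.Entry<String, String> p : kept.entrySet()) {
            key.append(separator).append(p.getKey()).append('=').append(p.getValue());
            separator = '&';
        }
        // One cached variant per distinct value of each Vary header.
        for (String header : varyHeaders) {
            String name = header.toLowerCase(Locale.ROOT);
            key.append('|').append(name).append('=').append(headers.getOrDefault(name, ""));
        }
        return key.toString();
    }
}
```

With `page` and `sort` allow-listed and `accept-encoding` as the only Vary header, a request for `/api/v1/products?sort=price&page=2&utm_source=mail` compressed with Brotli yields the key `/api/v1/products?page=2&sort=price|accept-encoding=br`. Note that Authorization and cookie values are deliberately not part of the key; such responses should bypass the shared cache instead.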

    Internal architecture

    Architecture overview

    text
    flowchart LR
    User --> Edge[CDN Edge]
    Edge -->|miss| Origin[Origin Server]
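
The hit/miss flow in the diagram can be sketched as a small in-memory cache. This is a simplified model under stated assumptions: `EdgeCache` is a hypothetical class, the origin is a plain function instead of an HTTP call, one TTL applies to every entry, and there is no eviction or thread safety.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical single-PoP cache; not thread-safe, for illustration only. */
final class EdgeCache {
    private static final class Entry {
        final String body;
        final long expiresAtMillis;

        Entry(String body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();
    private final Function<String, String> origin; // stands in for the HTTP fetch to origin
    private final LongSupplier clock;              // injectable so expiry can be tested
    private final long ttlMillis;
    private long hits;
    private long misses;

    EdgeCache(Function<String, String> origin, LongSupplier clock, long ttlMillis) {
        this.origin = origin;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    String get(String key) {
        long now = clock.getAsLong();
        Entry cached = store.get(key);
        if (cached != null && now < cached.expiresAtMillis) {
            hits++;                       // HIT: served from the edge, origin untouched
            return cached.body;
        }
        misses++;                         // MISS or expired: go back to the origin
        String body = origin.apply(key);
        store.put(key, new Entry(body, now + ttlMillis));
        return body;
    }

    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

Passing the clock in as a `LongSupplier` is a design choice for testability: expiry can be exercised by advancing a fake clock rather than sleeping.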

    Step-by-step explanation

    1. The user's DNS lookup resolves to a CDN anycast IP, which routes the request to the nearest PoP.
    2. PoP cache HIT → return immediately; MISS → fetch from origin (or shield).
    3. Static assets: long TTL + fingerprinted filenames (app.a1b2.js).
    4. API GET with public data: short TTL (5–60s), or Edge Side Includes (ESI) to assemble pages from separately cached fragments.
    5. Video HLS/DASH: segment files are cached at the edge; manifests get a low TTL so players pick up new segments quickly.
    6. Signed URLs/cookies for private content (premium video, paid downloads).
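
Steps 3 to 5 amount to a per-path TTL policy, which could be sketched as below. The class `CachePolicy`, the file-extension lists, the fingerprint regex, and every TTL value are illustrative assumptions, not recommendations from any particular CDN.

```java
import java.util.regex.Pattern;

/** Hypothetical mapping from request path to a Cache-Control header value. */
final class CachePolicy {
    // Fingerprinted asset such as app.a1b2.js: a hex hash segment before the extension.
    private static final Pattern FINGERPRINTED =
            Pattern.compile(".*\\.[0-9a-f]{4,}\\.(js|css|png|jpg|svg|woff2)$");

    static String cacheControlFor(String path) {
        if (FINGERPRINTED.matcher(path).matches()) {
            // The content never changes under this name, so cache for a year.
            return "public, max-age=31536000, immutable";
        }
        if (path.endsWith(".m3u8") || path.endsWith(".mpd")) {
            // HLS/DASH manifests change as new segments appear: keep the TTL low.
            return "public, max-age=2";
        }
        if (path.endsWith(".ts") || path.endsWith(".m4s")) {
            // Video segments are immutable once written.
            return "public, max-age=86400";
        }
        if (path.startsWith("/api/")) {
            // Public GET data: a short TTL inside the 5-60s range from step 4.
            return "public, max-age=30";
        }
        // Conservative default: may be stored, but must be revalidated on every use.
        return "no-cache";
    }
}
```

For example, `CachePolicy.cacheControlFor("/static/app.a1b2.js")` returns the one-year immutable policy, while `/index.html` falls through to `no-cache` so a new deploy is picked up immediately.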

    Informative example

    CloudFront-style cache behavior and Spring Cache-Control headers for product catalog:

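    java
    import java.util.concurrent.TimeUnit;

    import org.springframework.http.CacheControl;
    import org.springframework.http.HttpHeaders;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    @RequestMapping("/api/v1/products")
    public class ProductController {
        private final ProductService products;

        public ProductController(ProductService products) {
            this.products = products;
        }

        @GetMapping("/{id}")
        public ResponseEntity<ProductDto> get(@PathVariable("id") String id) {
            ProductDto dto = products.findById(id);
            // Public catalog data: any shared cache (CDN edge) may store it for 60 seconds.
            return ResponseEntity.ok()
                    .cacheControl(CacheControl.maxAge(60, TimeUnit.SECONDS).cachePublic())
                    // The body differs by compression, so keep one cached variant per encoding.
                    .header(HttpHeaders.VARY, "Accept-Encoding")
                    .body(dto);
        }
    }
    // CDN behavior (conceptual YAML)
    // default_ttl: 86400 for /static/*
    // min_ttl: 0 for /api/* (respect origin Cache-Control)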

    Never let a shared cache store personalized or authenticated responses: mark them Cache-Control: private (or no-store) rather than relying on Vary alone. URL shortener redirects cache well with a long TTL.
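
The rule about personalized responses can be expressed as a storability check that a shared cache might run before saving a response. This is a sketch, not a full HTTP caching implementation: `SharedCacheRules` is a hypothetical class, directive values are assumed to be well-formed integers, and refusing every `Set-Cookie` response is a deliberately conservative policy rather than a requirement of the HTTP specification.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.OptionalLong;
import java.util.stream.Collectors;

/** Hypothetical storability rules for a shared cache such as a CDN edge. */
final class SharedCacheRules {
    static List<String> directives(String cacheControl) {
        if (cacheControl == null || cacheControl.isBlank()) return List.of();
        return Arrays.stream(cacheControl.split(","))
                .map(d -> d.trim().toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    /** s-maxage takes precedence over max-age in shared caches. */
    static OptionalLong sharedTtlSeconds(String cacheControl) {
        OptionalLong maxAge = OptionalLong.empty();
        for (String d : directives(cacheControl)) {
            if (d.startsWith("s-maxage=")) {
                return OptionalLong.of(Long.parseLong(d.substring("s-maxage=".length())));
            }
            if (d.startsWith("max-age=")) {
                maxAge = OptionalLong.of(Long.parseLong(d.substring("max-age=".length())));
            }
        }
        return maxAge;
    }

    static boolean storable(String cacheControl, boolean hasSetCookie, boolean requestHadAuthorization) {
        List<String> d = directives(cacheControl);
        if (d.contains("private") || d.contains("no-store")) return false;
        if (hasSetCookie) return false; // conservative: never share a session cookie
        if (requestHadAuthorization) {
            // Authenticated requests need an explicit opt-in from the origin.
            boolean optIn = d.contains("public") || d.contains("must-revalidate")
                    || d.stream().anyMatch(x -> x.startsWith("s-maxage="));
            if (!optIn) return false;
        }
        // Without an explicit freshness lifetime, this sketch refuses to store.
        return sharedTtlSeconds(cacheControl).isPresent();
    }
}
```

Under these rules the `public, max-age=60` header from the controller above is storable for an anonymous request, while the same response carrying `Set-Cookie`, or a `private` response, is passed through uncached.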

    Real-world use

    Real-world use cases

    • E-commerce product images and JS bundles globally.
    • OTT: video segment delivery, where the CDN typically serves the bulk of the bandwidth (80%+).
    • Social: avatar and media thumbnails at edge.
    • News/fintech static marketing pages and public rate tables.

    Best practices

    • Fingerprint static assets for immutable long-term caching.
    • Use cache tags for bulk invalidation on catalog updates.
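    • Monitor cache hit ratio: a low ratio means you pay for the CDN while the origin still carries the load.
    • Geo-restrict content at the CDN edge for licensing compliance.
    • Enable HTTP/2 for connection multiplexing and Brotli compression for smaller payloads at the CDN.
    • Protect the origin with CDN-only access (for example, a secret header the CDN adds to origin requests) so clients cannot bypass the edge.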
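
Tag-based invalidation from the list above can be sketched with a reverse index from tag to cache keys. `TaggedCache` and the tag names are hypothetical; production CDNs implement this internally and expose it through a purge API whose shape varies by vendor.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical cache that supports bulk purge by tag. */
final class TaggedCache {
    private final Map<String, String> entries = new HashMap<>();
    private final Map<String, Set<String>> keysByTag = new HashMap<>();

    void put(String key, String body, String... tags) {
        entries.put(key, body);
        for (String tag : tags) {
            keysByTag.computeIfAbsent(tag, t -> new HashSet<>()).add(key);
        }
    }

    String get(String key) {
        return entries.get(key); // null means MISS
    }

    /** Removes every entry carrying the tag and returns how many were dropped. */
    int purgeTag(String tag) {
        Set<String> keys = keysByTag.remove(tag);
        if (keys == null) return 0;
        int removed = 0;
        for (String key : keys) {
            // A key may already be gone if another of its tags was purged earlier.
            if (entries.remove(key) != null) removed++;
        }
        return removed;
    }
}
```

Tagging a product page with both `product-42` and `category-shoes` means a price change purges one tag and a category-wide promotion purges the other, without enumerating URLs by hand.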

    Common mistakes

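    • Caching responses that carry Set-Cookie, so users receive someone else's session.
    • Query strings changing the cache key unintentionally (for example, a ?v=1 parameter bumped on every deploy), which fragments the cache and lowers the hit ratio.
    • No purge strategy, leaving stale prices after a promotion.
    • Putting all API traffic behind CDN caching without TTL discipline, which causes stale-data bugs.
    • Ignoring HTTPS certificate management at the CDN for custom domains.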

    Advanced interview questions

    Q1 (Beginner): What problem does a CDN solve?
    It reduces latency and origin load by caching content geographically close to users.
    Q2 (Beginner): What content should NOT be cached on a CDN?
    Personalized or authenticated responses, user-specific data, and uncacheable POST results.
    Q3 (Intermediate): How do you invalidate the CDN cache after a deploy?
    Purge by path or cache tag, or use versioned asset URLs; prefer fingerprinted filenames.
    Q4 (Intermediate): Should a CDN cache API responses?
    Yes, for public read-heavy GET endpoints with a short TTL and cache keys that exclude user-specific headers.
    Q5 (Advanced): Design a CDN for a global video platform.
    Cache HLS segments at the edge with a long TTL, use signed URLs for premium content, add an origin shield and a mid-tier cache, geo-block where licensing requires it, set a hit ratio SLO of 95%, and plan a failover origin region. Multi-CDN is optional.
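
The signed URLs mentioned in Q5 can be illustrated with a generic HMAC scheme. This is a sketch of the idea only: the class `SignedUrls`, the `expires` and `sig` parameter names, and the signed string layout are assumptions, and each real CDN defines its own signing format and key management.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical expiring-URL signer shared by the app (sign) and the edge (verify). */
final class SignedUrls {
    static String sign(String path, long expiresEpochSec, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            // Sign the path and the expiry together so neither can be altered alone.
            byte[] raw = mac.doFinal((path + "\n" + expiresEpochSec).getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    static String signedUrl(String path, long expiresEpochSec, byte[] secret) {
        return path + "?expires=" + expiresEpochSec + "&sig=" + sign(path, expiresEpochSec, secret);
    }

    static boolean verify(String path, long expiresEpochSec, String sig, byte[] secret, long nowEpochSec) {
        if (nowEpochSec >= expiresEpochSec) return false; // link has expired
        byte[] expected = sign(path, expiresEpochSec, secret).getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison avoids leaking how many characters matched.
        return MessageDigest.isEqual(expected, sig.getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the signature and expiry travel in the query string, they should be excluded from the cache key once verified; otherwise every viewer's unique URL would produce its own cache entry for the same segment.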

    Summary

    A CDN caches content at edge PoPs for latency and scale. Static assets use long TTLs plus fingerprinting; APIs use short TTLs, applied carefully. Sound invalidation and cache-key design prevent stale or leaked data. OTT and URL shortener designs rely heavily on CDNs. An origin shield protects the backend from miss storms. Data layer choices (SQL vs NoSQL) are the next topic after CDN edge caching.
