
    URL Shortener

    Design a URL shortener like bit.ly — map long URLs to short slugs, redirect with low latency at global scale.


    Introduction

    This is a classic interview problem that tests hashing, read-heavy capacity math, CDN caching, and collision handling.

    Assume a 100:1 read/write ratio, custom aliases for premium users, click-count analytics, and 5-year retention. This lesson walks through a complete HLD covering APIs, storage, and scaling.

    Understanding the topic

    Key concepts

    • Base62 encoding of an auto-increment ID or a hash: 7 characters give 62^7 ≈ 3.5 trillion combinations.
    • Read-path latency is critical: serve redirects from cache and CDN.
    • Write path: uniqueness check, plus optional custom-alias validation.
    • Analytics are asynchronous so they never slow the redirect path.
    • Expired or deleted URL → 410 Gone.
    • Rate-limit creates to prevent abuse.
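
The Base62 idea in the first bullet can be sketched as a small codec. This is a minimal illustration, not a prescribed implementation: the class name `Base62` and the alphabet ordering (digits, then uppercase, then lowercase) are assumptions chosen for the example.

```java
// Hypothetical helper: Base62 codec for non-negative numeric IDs.
final class Base62 {
    // Alphabet order is an assumption; any fixed ordering of 62 symbols works.
    private static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final int BASE = ALPHABET.length(); // 62

    private Base62() {}

    /** Encodes a non-negative ID into its shortest Base62 representation. */
    static String encode(long id) {
        if (id < 0) throw new IllegalArgumentException("id must be non-negative");
        if (id == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % BASE))); // least significant digit first
            id /= BASE;
        }
        return sb.reverse().toString();
    }

    /** Decodes a Base62 slug back to the numeric ID; rejects unknown characters. */
    static long decode(String slug) {
        if (slug.isEmpty()) throw new IllegalArgumentException("slug must not be empty");
        long id = 0;
        for (int i = 0; i < slug.length(); i++) {
            int digit = ALPHABET.indexOf(slug.charAt(i));
            if (digit < 0) throw new IllegalArgumentException("invalid character: " + slug.charAt(i));
            id = Math.addExact(Math.multiplyExact(id, BASE), digit); // throws on overflow
        }
        return id;
    }
}
```

With this alphabet, `Base62.encode(125)` yields `"21"`, and the largest ID that still fits in 7 characters is 62^7 - 1 = 3,521,614,606,207, which encodes to `"zzzzzzz"`.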

    Internal architecture

    Architecture overview

    text
    flowchart TB
    Client --> LB --> API
    API --> Redis
    API --> DB[(Cassandra)]
    Client --> CDN --> Redirect

    Step-by-step explanation

    1. POST /api/v1/urls {longUrl, customAlias?} → API → generate slug → store → return short URL.
    2. GET /{slug} → CDN edge → on a miss, the API/redirect service → 301 to the long URL.
    3. PostgreSQL or Cassandra stores the slug→url mapping; Redis caches hot slugs.
    4. Kafka click events → Flink aggregates counts → analytics DB.
    5. ID generation: Snowflake or a sharded DB sequence, so generated IDs never collide.
    6. Global: the CDN caches the 301 with a long TTL; purge on URL update or delete.
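
Step 5 mentions Snowflake-style IDs. The sketch below shows one common layout (timestamp, worker ID, per-millisecond sequence). The bit widths, the custom epoch, and the class name `SnowflakeIdGenerator` are assumptions for illustration, not values taken from this lesson.

```java
// Hypothetical Snowflake-style generator: 41-bit timestamp | 10-bit worker | 12-bit sequence.
final class SnowflakeIdGenerator {
    private static final long EPOCH_MS = 1704067200000L; // 2024-01-01T00:00:00Z, arbitrary custom epoch
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER = (1L << WORKER_BITS) - 1;     // 1023
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    private final long workerId;
    private long lastMs = -1;
    private long sequence = 0;

    SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER + "]");
        }
        this.workerId = workerId;
    }

    /** Returns an ID that is unique per worker and increasing over time. */
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMs) {
            // Refuse to issue IDs rather than risk duplicates after a clock step backwards.
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastMs) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMs) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastMs = now;
        return ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
            | (workerId << SEQUENCE_BITS)
            | sequence;
    }
}
```

One design consequence worth noting: a 63-bit value can need up to 11 Base62 characters, so a strict 7-character slug implies a smaller ID space, such as a sharded database sequence.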

    Informative example

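    Create and redirect flow with Spring REST and Redis cache-aside. The Redis client is assumed to be Jedis (JedisPooled); UrlRepository, CreateUrlRequest, and ShortUrlResponse are hypothetical application types:

    java
    import java.net.URI;
    import jakarta.validation.Valid; // javax.validation.Valid before Spring Boot 3
    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.stereotype.Service;
    import org.springframework.web.bind.annotation.*;
    import org.springframework.web.server.ResponseStatusException;
    import redis.clients.jedis.JedisPooled;

    @RestController
    public class UrlController {
        private final UrlService urls;
        public UrlController(UrlService urls) { this.urls = urls; }

        @PostMapping("/api/v1/urls")
        ResponseEntity<ShortUrlResponse> create(@Valid @RequestBody CreateUrlRequest req) {
            return ResponseEntity.status(HttpStatus.CREATED).body(urls.create(req));
        }

        @GetMapping("/{slug}")
        ResponseEntity<Void> redirect(@PathVariable String slug) {
            String longUrl = urls.resolve(slug); // responds 404 if the slug is unknown
            urls.recordClickAsync(slug);         // must not block the redirect
            return ResponseEntity.status(HttpStatus.MOVED_PERMANENTLY)
                .location(URI.create(longUrl)).build();
        }
    }

    @Service
    public class UrlService { // lives in its own file, UrlService.java
        private final JedisPooled redis; // assumed Jedis client
        private final UrlRepository db;  // hypothetical slug→URL repository
        public UrlService(JedisPooled redis, UrlRepository db) { this.redis = redis; this.db = db; }

        // Cache-aside read: try Redis, fall back to the database, then populate the cache.
        public String resolve(String slug) {
            String cached = redis.get("url:" + slug);
            if (cached != null) return cached;
            String longUrl = db.findLongUrl(slug);
            if (longUrl == null) throw new ResponseStatusException(HttpStatus.NOT_FOUND);
            redis.setex("url:" + slug, 86400, longUrl); // cache for 24 hours
            return longUrl;
        }
        // create(...) and recordClickAsync(...) are omitted here.
    }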

    Capacity: 35k peak redirect QPS → Redis + CDN. At the assumed 100:1 read/write ratio, writes peak at ~350 QPS → Cassandra or sharded SQL.
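
The capacity arithmetic can be made explicit. The sketch below derives write QPS from the 35k peak redirect QPS and the 100:1 read/write ratio assumed in the introduction, then projects 5 years of retention. The 500 bytes per record is an assumed figure for illustration only, and sustaining the peak rate for the whole period makes this an upper bound.

```java
// Back-of-the-envelope capacity estimate; every constant is an assumption or a stated lesson input.
final class CapacityEstimate {
    static final long PEAK_REDIRECT_QPS = 35_000; // from the lesson
    static final long READ_WRITE_RATIO = 100;     // 100:1, from the lesson
    static final long RETENTION_YEARS = 5;        // from the lesson
    static final long BYTES_PER_RECORD = 500;     // assumed: slug + long URL + metadata
    static final long SECONDS_PER_YEAR = 365L * 24 * 60 * 60;

    private CapacityEstimate() {}

    /** Peak write rate implied by the read rate and the read/write ratio. */
    static long peakWriteQps() {
        return PEAK_REDIRECT_QPS / READ_WRITE_RATIO;
    }

    /** Upper bound on URLs created over the retention window (peak rate sustained). */
    static long urlsOverRetention() {
        return peakWriteQps() * SECONDS_PER_YEAR * RETENTION_YEARS;
    }

    /** Upper bound on raw storage, before replication and indexes. */
    static long storageBytes() {
        return urlsOverRetention() * BYTES_PER_RECORD;
    }

    /** Number of distinct Base62 slugs of the given length, i.e. 62^length. */
    static long slugSpace(int length) {
        long n = 1;
        for (int i = 0; i < length; i++) n = Math.multiplyExact(n, 62L);
        return n;
    }
}
```

Under these assumptions, writes peak at 350 QPS, at most about 55.2 billion URLs are created in 5 years (roughly 27.6 TB raw), and the 7-character slug space of about 3.5 trillion leaves more than 60x headroom.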

    Real-world use

    Real-world use cases

    • Marketing campaign tracking links in e-commerce.
    • Social media character limits — short share URLs.
    • SMS banking alerts with compact links.
    • QR code generation for restaurant menus.

    Best practices

    • Maintain a reserved-slug dictionary to block offensive custom aliases.
    • Scan long URLs for malware before storing them.
    • A 301 permanent redirect is SEO-friendly and cacheable.
    • Keep the analytics pipeline separate from the redirect hot path.
    • Use consistent hashing if you shard the slug space.
    • Monitor the cache hit ratio on redirects.
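
The consistent-hashing practice above can be sketched with a sorted map acting as the ring. This is a minimal illustration under assumptions: the class name `ConsistentHashRing`, the use of MD5 purely as a placement hash, and the virtual-node naming scheme are all choices made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical consistent-hash ring mapping slugs to shard names.
final class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    /** Places the node on the ring at several points to even out the load. */
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.put(hash(node + "#" + i), node);
    }

    /** Removes only this node's points; keys owned by other nodes are unaffected. */
    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.remove(hash(node + "#" + i), node);
    }

    /** Returns the first node at or after the slug's position, wrapping around the ring. */
    String nodeFor(String slug) {
        if (ring.isEmpty()) throw new IllegalStateException("no nodes on the ring");
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(slug));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // First 8 bytes of the MD5 digest as a long; used for placement only, not for security.
    private static long hash(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (digest[i] & 0xFF);
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The property that matters for resharding is that removing a node only remaps the slugs that node owned; every other slug keeps its shard, so cache and storage churn stay proportional to the change.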

    Common mistakes

    • Defaulting to 302 instead of 301: a 302 is not cached unless explicit cache headers are set, which hurts CDN caching and SEO.
    • Synchronously incrementing the click count on redirect: causes latency spikes.
    • Hash-only slugs without collision handling.
    • No rate limit on create: invites spam that fills storage.
    • Storing only the hash: makes custom aliases hard to support.
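
To illustrate the collision-handling point, here is a minimal sketch of hash-based slug generation with salted retries. The in-memory `ConcurrentMap` stands in for a database with a unique constraint on the slug; the class name `HashSlugGenerator`, the salt format, and the retry limit are assumptions for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentMap;

// Hypothetical generator: truncated SHA-256 slug with retry on collision.
final class HashSlugGenerator {
    private static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final int SLUG_LENGTH = 7;
    private static final int MAX_ATTEMPTS = 5;

    // Stand-in for a table with a unique index on slug; putIfAbsent plays the insert.
    private final ConcurrentMap<String, String> store;

    HashSlugGenerator(ConcurrentMap<String, String> store) {
        this.store = store;
    }

    /** Returns a slug for the URL, re-hashing with a new salt if the slug is taken. */
    String create(String longUrl) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            String slug = slugOf(longUrl, attempt);
            String existing = store.putIfAbsent(slug, longUrl);
            if (existing == null || existing.equals(longUrl)) return slug; // new, or same URL again
            // Slug belongs to a different URL: try the next salt.
        }
        throw new IllegalStateException("no free slug after " + MAX_ATTEMPTS + " attempts");
    }

    String resolve(String slug) {
        return store.get(slug);
    }

    /** Derives a 7-character Base62 slug from SHA-256(longUrl + "|" + attempt). */
    static String slugOf(String longUrl, int attempt) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("SHA-256")
                .digest((longUrl + "|" + attempt).getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        long value = 0;
        for (int i = 0; i < 8; i++) value = (value << 8) | (digest[i] & 0xFF);
        value &= Long.MAX_VALUE; // clear the sign bit so the remainder is non-negative
        StringBuilder sb = new StringBuilder(SLUG_LENGTH);
        for (int i = 0; i < SLUG_LENGTH; i++) {
            sb.append(ALPHABET.charAt((int) (value % 62)));
            value /= 62;
        }
        return sb.toString();
    }
}
```

Creating the same URL twice returns the same slug, while a slug already held by a different URL triggers a retry with the next salt. A custom alias would skip hashing and go straight to the same insert-if-absent check, failing instead of retrying.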

    Advanced interview questions

    Q1 (Beginner): How do you generate a short URL slug?
    Base62-encode a monotonic ID, or truncate a hash and retry on collision; custom aliases get a separate uniqueness check.
    Q2 (Beginner): Why use a CDN for a URL shortener?
    Read-heavy redirects are cached at the edge, which reduces origin QPS dramatically.
    Q3 (Intermediate): SQL vs NoSQL for the URL store?
    Cassandra or DynamoDB for write scale and built-in TTL; PostgreSQL with sharding is fine up to tens of thousands of writes per second.
    Q4 (Intermediate): How do you handle a custom alias collision?
    Enforce a transactional unique index on the alias; return 409 Conflict if it is taken.
    Q5 (Advanced): Scale to 1B URLs and 100k redirect QPS.
    CDN caching of 301s, a Redis cluster holding millions of hot keys, Cassandra with 100+ nodes sharded by slug, asynchronous analytics through Kafka, multi-region active-active reads, and Snowflake IDs.

    Summary

    A URL shortener is read-heavy, so CDN and Redis caching dominate the design. Generate slugs from Base62-encoded IDs, or from hashes with a collision strategy. Keep analytics asynchronous and off the critical redirect path. 301 redirects cache well at the edge. Rate-limit creates and scan long URLs for malware on create. The next lesson, on chat systems, adds real-time bidirectional requirements.
