Load Balancers
Introduction
A load balancer (LB) distributes incoming traffic across multiple backend servers to improve capacity, fault tolerance, and maintenance flexibility. Clients connect to the LB's virtual IP; the LB forwards requests using algorithms like round-robin, least connections, or consistent hashing.
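To make two of the algorithms named above concrete, here is a minimal sketch of round-robin and least-connections selection. The class names and the backend names `s1` to `s3` are illustrative assumptions; a production LB would also account for health state and concurrency.

```python
import itertools


class RoundRobin:
    """Cycle through backends in a fixed order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the backend with the fewest in-flight connections."""

    def __init__(self, backends):
        # Hypothetical bookkeeping: in-flight connection count per backend.
        self.active = {b: 0 for b in backends}

    def pick(self):
        # min() returns the first backend with the lowest count,
        # so ties are broken by insertion order.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when a connection to this backend closes.
        self.active[backend] -= 1


rr = RoundRobin(["s1", "s2", "s3"])
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnections(["s1", "s2", "s3"])
first = lc.pick()    # all idle, so the first backend wins the tie
second = lc.pick()   # s1 is now busy, so the next idle backend is chosen
lc.release(first)    # s1's connection finishes
third = lc.pick()    # s1 is idle again and first in order
```

Round-robin ignores how long each request runs, whereas least connections adapts when some connections stay open much longer than others.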
L4 load balancers (TCP/UDP) route by IP address and port; L7 load balancers (HTTP) route by URL path, headers, and cookies. Cloud providers offer managed LBs (AWS ALB/NLB, GCP LB) that integrate with health checks and auto-scaling groups.
This lesson covers LB placement in high-level design (HLD), algorithm selection, health checks, SSL termination, and sticky session trade-offs.
Understanding the topic
Key concepts
- L4 (transport): fast, protocol-agnostic, good for database proxies and UDP game traffic.
- L7 (application): path-based routing, TLS termination, WAF integration.
- Algorithms: round-robin, weighted, least connections, IP hash, consistent hash (cache locality).
- Health checks: HTTP /health, TCP connect, or custom probes; unhealthy nodes are drained.
- SSL/TLS termination at the LB offloads handshake and encryption CPU from app servers; re-encrypting traffic to backends is optional.
- Active-active vs active-passive pairs for LB itself (DNS failover, anycast).
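The cache-locality benefit of consistent hashing in the list above can be illustrated with a small hash ring. This is a sketch under stated assumptions: the `HashRing` class, the choice of SHA-256, the 100 virtual nodes per server, and the `user-N` keys are all illustrative, not taken from any particular LB.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        # Each node owns several points on the ring to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key):
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["s1", "s2", "s3"])
keys = [f"user-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("s3")  # simulate one backend leaving the pool
after = {k: ring.lookup(k) for k in keys}

# Only keys that lived on the removed node change owner.
moved = [k for k in keys if before[k] != after[k]]
```

With a plain `hash(key) % N` scheme, changing N remaps most keys; on the ring, keys owned by the surviving servers stay where they were, so their caches remain warm.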
Internal architecture
Architecture overview
    flowchart LR
        Client --> LB[Load Balancer]
        LB --> S1[Server 1]
        LB --> S2[Server 2]
        LB --> S3[Server 3]
Step-by-step explanation
- Internet → DNS → Cloud LB (TLS terminate) → target group of N app instances.
- LB health check every 10s on /ready; remove instance on 3 failures.
- Auto-scaling group registers new instances with target group automatically.
- Separate internal LB for service-to-service traffic in private VPC.
- WebSocket: L7 LB with connection stickiness or dedicated gateway tier.
- Global: GeoDNS or anycast LB routes to nearest region; health-based failover.
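The health-check step above (remove an instance on 3 failures) can be sketched as a small state tracker. The code assumes the failures must be consecutive and that 2 consecutive successes re-admit an instance; both are assumptions for illustration, and the `TargetHealth` name is hypothetical.

```python
class TargetHealth:
    """Track one target's rotation state from a stream of probe results."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold  # failures before removal
        self.healthy_threshold = healthy_threshold      # assumed: successes to re-admit
        self.healthy = True
        self._streak = 0  # consecutive probes contradicting the current state

    def record(self, probe_ok):
        """Feed one probe result; return True if the target is in rotation."""
        if probe_ok == self.healthy:
            # Result agrees with the current state, so reset the streak.
            self._streak = 0
        else:
            self._streak += 1
            limit = (self.unhealthy_threshold if self.healthy
                     else self.healthy_threshold)
            if self._streak >= limit:
                self.healthy = probe_ok
                self._streak = 0
        return self.healthy


target = TargetHealth()
probes = [False, False, True, False, False, False, True, True]
states = [target.record(ok) for ok in probes]
```

In this run the two early failures do not remove the target because a success interrupts them; the later run of three failures does, and two successes bring it back.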
Informative example
Spring Boot actuator health for LB probes — separate liveness from readiness:
    management:
      endpoints:
        web:
          exposure:
            include: health,info,prometheus
      endpoint:
        health:
          probes:
            enabled: true
          group:
            liveness:
              include: livenessState
            readiness:
              include: readinessState,db,redis
    spring:
      datasource:
        url: jdbc:postgresql://db.internal:5432/shop
      data:
        redis:
          host: redis.internal
    # AWS ALB target group health check path: /actuator/health/readiness
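Dependency checks (db, redis) belong to the readiness group only, so a dependency outage takes the instance out of rotation without triggering a liveness restart loop. The LB should probe the readiness path only.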
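The split between the two groups can be illustrated outside Spring with a few lines of Python. This is a simplified model, not Spring Boot's implementation: the `probe_status` helper is hypothetical, and it only mirrors the group membership from the configuration above.

```python
# Group membership copied from the configuration above.
LIVENESS = ["livenessState"]
READINESS = ["readinessState", "db", "redis"]


def probe_status(checks, group):
    """Return HTTP 200 if every check in the group passes, else 503."""
    return 200 if all(checks[name] for name in group) else 503


# Hypothetical snapshot: the process is fine but Redis is unreachable.
checks = {
    "livenessState": True,
    "readinessState": True,
    "db": True,
    "redis": False,
}

live = probe_status(checks, LIVENESS)    # process is up: no restart needed
ready = probe_status(checks, READINESS)  # dependency down: stop routing here
```

Here the liveness probe still passes, so an orchestrator would not restart the instance, while the readiness probe fails and the LB stops sending it traffic until Redis recovers.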
Real-world use
Real-world use cases
- E-commerce web tier during holiday traffic — horizontal pods + ALB.
- Banking API gateway cluster with weighted routing for canary releases.
- OTT API: L7 path routing /api vs /static to different target groups.
- Multi-region failover: DNS LB health checks shift traffic to secondary region.
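The path-routing use case above (/api vs /static to different target groups) can be sketched as an ordered rule list where the first match wins. The target group names and the default group are hypothetical; managed L7 LBs express the same idea through listener rules.

```python
# Ordered rules: (path prefix, target group). Names are illustrative.
RULES = [
    ("/api/", "api-targets"),
    ("/static/", "static-targets"),
]
DEFAULT_TARGET_GROUP = "web-targets"


def route(path):
    """Return the target group for a request path; first matching rule wins."""
    for prefix, target_group in RULES:
        if path.startswith(prefix):
            return target_group
    return DEFAULT_TARGET_GROUP


api_group = route("/api/v1/titles")
static_group = route("/static/app.js")
fallback_group = route("/")
```

Splitting traffic this way lets the API tier and the static-asset tier scale and deploy independently behind one hostname.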
Best practices
- Terminate TLS at LB with modern cipher policies; rotate certs via ACM/Let's Encrypt.
- Use least connections for long-lived requests; round-robin for uniform short API calls.
- Enable connection draining during deploys (deregistration delay).
- Avoid sticky sessions unless required — prefer shared Redis session store.
- Monitor LB 5xx rates and target health as primary alerts.
- Place rate limiting at edge (API gateway) in addition to LB.
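Connection draining from the list above can be modeled as follows: a draining target receives no new requests and is removed once its in-flight requests finish or the deregistration delay expires. The `Pool` and `Target` classes and the 30-second delay are assumptions for illustration.

```python
class Target:
    def __init__(self, name):
        self.name = name
        self.in_flight = 0      # requests currently being served
        self.deadline = None    # set when draining starts


class Pool:
    def __init__(self, names, deregistration_delay=30.0):
        self.targets = {n: Target(n) for n in names}
        self.delay = deregistration_delay

    def start_draining(self, name, now):
        # Mark the target as draining; it stays registered for now.
        self.targets[name].deadline = now + self.delay

    def eligible(self):
        """Targets that may receive new requests."""
        return [t.name for t in self.targets.values() if t.deadline is None]

    def reap(self, now):
        """Remove draining targets that are idle or past their deadline."""
        removed = []
        for t in list(self.targets.values()):
            if t.deadline is not None and (t.in_flight == 0 or now >= t.deadline):
                del self.targets[t.name]
                removed.append(t.name)
        return removed


pool = Pool(["s1", "s2"], deregistration_delay=30.0)
pool.targets["s1"].in_flight = 2        # two requests still running on s1
pool.start_draining("s1", now=0.0)

eligible = pool.eligible()              # only s2 takes new traffic
early = pool.reap(now=10.0)             # s1 still busy and within the delay
pool.targets["s1"].in_flight = 0        # its requests complete
late = pool.reap(now=12.0)              # s1 can now be removed safely
```

The delay bounds how long a deploy waits for slow requests, so it should be at least as long as the slowest request the service is expected to finish.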
Common mistakes
- Health check hits wrong port or auth-protected path — all nodes marked unhealthy.
- Sticky sessions causing uneven load during scale events.
- Running a single instance with no LB and no redundancy: zero fault tolerance.
- Ignoring cross-AZ LB data transfer costs at scale.
- L7 LB idle timeout set shorter than the WebSocket keepalive interval, so idle connections get dropped.
Advanced interview questions
Q1 (Beginner): What does a load balancer do?
Q2 (Beginner): How do L4 and L7 load balancers differ?
Q3 (Intermediate): When would you use consistent hashing?
Q4 (Intermediate): How do load balancers interact with auto-scaling?
Q5 (Advanced): Design an LB strategy for a global chat API.
Summary
Load balancers enable horizontal scaling and high availability. An L7 LB adds path routing, TLS termination, and HTTP-aware policies. Health checks must reflect real readiness, not just that the process is up. Prefer stateless backends over sticky sessions. Global load balancing pairs with multi-region DR designs. A reverse proxy overlaps with an LB in function; the next lesson clarifies the difference.