Web Cache - Interview Tips

Interview Approach

Initial Questions (5 minutes)

Scale: How many requests per second?
Content Types: Static assets, HTML, API responses?
Geographic Distribution: Single region or global?
Consistency: How fresh must content be?
Cache Hit Rate: What's the target hit rate?

Design Progression

Step 1: High-Level Architecture (5 min)

Clients → Load Balancer → Cache Nodes → Origin Servers

Step 2: Cache Storage (5 min)

Multi-tier: Memory (L1) + SSD (L2)
Eviction policy: LRU
TTL-based expiration

Step 3: Cache Key Generation (5 min)

URL + query parameters
Vary headers (Accept-Encoding, Accept-Language)
Custom cache keys

Step 4: Cache Invalidation (5 min)

TTL expiration
Explicit purge (URL, pattern, tag)
Conditional requests (ETag, If-Modified-Since)

Step 5: Scaling (5 min)

Horizontal scaling (add nodes)
Geographic distribution
Load balancing strategies

Step 6: Failure Handling (5 min)

Serve stale on origin failure
Circuit breaker
Health checks and failover

Key Topics to Cover

Must Cover

✅ Cache storage (memory + disk)
✅ Cache key generation
✅ Eviction policies (LRU)
✅ TTL and expiration
✅ Cache invalidation

Should Cover

✅ Cache stampede prevention
✅ Compression
✅ Vary header handling
✅ Monitoring and metrics

Nice to Cover

✅ CDN integration
✅ Security (cache poisoning)
✅ Cost optimization

Common Pitfalls

❌ Not discussing cache stampede ✅ Explain request coalescing and stale-while-revalidate

❌ Ignoring cache invalidation ✅ Discuss multiple invalidation strategies

❌ Forgetting about Vary headers ✅ Explain how to cache multiple versions

❌ Not considering origin failures ✅ Discuss serving stale content

Talking Points

Cache Hit Rate

"We target 90%+ hit rate for static content. This is achieved through:

Appropriate TTLs (longer for static, shorter for dynamic)
Large cache size (more content cached)
Cache warming for popular content
Request coalescing to prevent stampede"

Cache Stampede

"When cache expires and many requests arrive simultaneously, we use:

Request coalescing: First request fetches, others wait
Stale-while-revalidate: Serve stale, refresh async
Probabilistic early expiration: Refresh before TTL"

Scaling

"We scale horizontally by adding cache nodes. Each node is stateless, so no data migration needed. Load balancer distributes requests using consistent hashing or geographic routing."

Time Management

45-Minute Interview

0-5 min: Clarify requirements
5-15 min: High-level architecture
15-25 min: Deep dive (storage, invalidation)
25-35 min: Scaling and failure handling
35-40 min: Follow-up questions
40-45 min: Wrap-up

Practice Recommendations

Draw diagrams quickly: Practice cache architecture diagrams
Know real systems: Study Varnish, Nginx, CloudFlare
Understand HTTP caching: Cache-Control, ETag, Vary headers
Practice explaining trade-offs: Memory vs disk, TTL vs invalidation

This guide helps you approach web cache design interviews with confidence and structure.