Web Cache - Scale and Constraints

Traffic Estimation

Request Volume

Total Requests: 100 billion requests per day
Cache Hits: 90 billion (90% hit rate)
Cache Misses: 10 billion (10% miss rate)
Peak Traffic: 3x average during peak hours
Requests per Second: ~1.16M average, 3.5M peak
Geographic Distribution: 40% US, 30% EU, 20% Asia, 10% Other

Request Types

Static Assets: 60% (images, CSS, JS, fonts)
HTML Pages: 20% (full page caching)
API Responses: 15% (JSON/XML data)
Video Content: 5% (streaming segments)

User Metrics

Active Users: 500 million daily active users
Requests per User: 200 requests per day average
Session Duration: 15 minutes average
Concurrent Users: 10 million peak concurrent
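
As a cross-check, the request volume and user metrics above can be reproduced with a few lines of Python. This is a minimal sketch; the function and variable names are illustrative and do not come from any existing tool.

```python
SECONDS_PER_DAY = 86_400

def requests_per_second(requests_per_day, peak_factor=3.0):
    """Return (average, peak) requests per second for a daily volume."""
    average = requests_per_day / SECONDS_PER_DAY
    return average, average * peak_factor

# Figures taken from the lists above.
daily_active_users = 500e6
requests_per_user = 200
total_requests_per_day = daily_active_users * requests_per_user  # 100 billion

avg_rps, peak_rps = requests_per_second(total_requests_per_day)
cache_misses_per_day = total_requests_per_day * 0.10  # 10% miss rate

print(f"total/day: {total_requests_per_day:.3g}")
print(f"average: {avg_rps / 1e6:.2f}M req/s, peak: {peak_rps / 1e6:.2f}M req/s")
```

The peak comes out at about 3.47M req/s, which the estimate above rounds to 3.5M.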

Storage Capacity Planning

Content Size Distribution

Tiny (<10KB): 40% - HTML, CSS, JS, JSON (avg 5KB)
Small (10KB-100KB): 30% - Images, small files (avg 50KB)
Medium (100KB-1MB): 20% - Large images, documents (avg 500KB)
Large (1MB-10MB): 8% - Videos, PDFs (avg 5MB)
Very Large (>10MB): 2% - Large videos, downloads (avg 50MB)
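
The mean object size implied by these buckets can be computed directly. The sketch below assumes the percentages are shares of object count and uses the per-bucket averages as given; the names are illustrative.

```python
# (share of objects, average size in KB) per bucket, from the list above
SIZE_BUCKETS = {
    "tiny": (0.40, 5),
    "small": (0.30, 50),
    "medium": (0.20, 500),
    "large": (0.08, 5_000),
    "very_large": (0.02, 50_000),
}

def weighted_mean_kb(buckets):
    """Share-weighted mean size in KB, normalized by the total share."""
    total_share = sum(share for share, _ in buckets.values())
    return sum(share * size for share, size in buckets.values()) / total_share

mean_kb = weighted_mean_kb(SIZE_BUCKETS)
print(f"count-weighted mean object size: {mean_kb:.0f} KB")
```

Under that reading the mean is roughly 1.5MB, dominated by the two largest buckets. That is well above the 200KB figure used in the storage and bandwidth calculations, so the two should be reconciled before sizing hardware.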

Storage Requirements

Calculation:
- Unique URLs: 10 billion
- Average object size: 200KB (planning assumption; the bucket averages above imply a count-weighted mean closer to 1.5MB)
- Total raw data: 10B × 200KB = 2PB
- Compression ratio: 2x (gzip/brotli)
- Compressed data: 1PB
- Hot data (0-30 days): 300TB
- Warm data (31-90 days): 700TB
- Total active cache: 1PB

Per-Node Storage (100 nodes):
- Hot tier (NVMe): 3TB per node
- Warm tier (SSD): 7TB per node
- Total: 10TB per node
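
The storage arithmetic above can be checked as follows. This is a sketch using decimal units (1TB = 10^12 bytes), which is an assumption about how the figures were derived; variable names are illustrative.

```python
KB, TB, PB = 1e3, 1e12, 1e15  # decimal units, matching the figures above

unique_urls = 10e9
avg_object_bytes = 200 * KB
compression_ratio = 2.0
node_count = 100

raw_bytes = unique_urls * avg_object_bytes          # 2 PB
compressed_bytes = raw_bytes / compression_ratio    # 1 PB
hot_bytes, warm_bytes = 300 * TB, 700 * TB          # tier split from the text

# The two tiers should account for the whole compressed cache.
assert hot_bytes + warm_bytes == compressed_bytes

per_node_tb = {
    "hot": hot_bytes / node_count / TB,
    "warm": warm_bytes / node_count / TB,
    "total": compressed_bytes / node_count / TB,
}
print(per_node_tb)
```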

Cache Growth

Daily New Content: 100GB per day
Monthly Growth: 3TB per month
Annual Growth: 36TB per year
Retention Policy: 90 days for most content
Long-term Storage: Archive old content to object storage

Performance Requirements

Latency Targets

Memory Cache Hit: <500μs (P99)
SSD Cache Hit: <5ms (P99)
Cache Miss: <100ms (P99, includes origin fetch)
Cache Write: <10ms (async, non-blocking)
Invalidation: <1s propagation across cluster
Health Check: <100ms response

Throughput Requirements

Per-Node Read: 100K requests/sec
Per-Node Write: 10K cache updates/sec
Cluster Read: 10M requests/sec (100 nodes)
Cluster Write: 1M cache updates/sec
Origin Requests: 1M requests/sec at full cluster read capacity (10% miss rate); ~116K requests/sec at average load
Invalidations: 10K/sec cluster-wide
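
The throughput targets are capacity figures, so it helps to set them against the expected demand from the traffic estimate. A minimal sketch, with illustrative variable names:

```python
node_count = 100
per_node_read_rps = 100_000
per_node_write_rps = 10_000
miss_rate = 0.10

cluster_read_rps = node_count * per_node_read_rps     # 10M
cluster_write_rps = node_count * per_node_write_rps   # 1M

avg_demand_rps = 100e9 / 86_400        # ~1.16M, from the traffic estimate
peak_demand_rps = 3 * avg_demand_rps   # ~3.47M

origin_rps_at_capacity = cluster_read_rps * miss_rate
origin_rps_at_average = avg_demand_rps * miss_rate
peak_utilization = peak_demand_rps / cluster_read_rps

print(f"origin load: {origin_rps_at_average / 1e3:.0f}K/s average, "
      f"{origin_rps_at_capacity / 1e6:.1f}M/s at full capacity")
print(f"peak demand uses {peak_utilization:.0%} of cluster read capacity")
```

On these numbers, peak demand uses about a third of the cluster read capacity, and the origin sees roughly 116K requests/sec at average load.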

Connection Handling

Concurrent Connections: 100K per node
New Connections: 10K/sec per node
Keep-Alive: 60 seconds default
Connection Pool: 1K connections to origin per node
WebSocket Support: 10K concurrent WebSocket connections

Network Bandwidth

Inbound Traffic

Cache Misses: 10B requests/day × 200KB = 2PB/day
Average Inbound: 23GB/sec
Peak Inbound: 69GB/sec (3x average)
Per-Node Inbound: 230MB/sec average, 690MB/sec peak

Outbound Traffic

Cache Hits: 90B requests/day × 200KB = 18PB/day
Cache Misses: 10B requests/day × 200KB = 2PB/day
Total Outbound: 20PB/day
Average Outbound: 231GB/sec
Peak Outbound: 693GB/sec
Per-Node Outbound: 2.3GB/sec average, 6.9GB/sec peak
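
The inbound and outbound figures follow from the daily request counts and the 200KB average object size. The sketch below reproduces them and converts the per-node peak to Gbit/s, since network links are usually rated that way; the names are illustrative.

```python
SECONDS_PER_DAY = 86_400
GB = 1e9
AVG_OBJECT_BYTES = 200e3   # 200KB planning figure
NODE_COUNT = 100
PEAK_FACTOR = 3

def avg_gb_per_sec(requests_per_day):
    """Average throughput in GB/sec for a daily request count."""
    return requests_per_day * AVG_OBJECT_BYTES / SECONDS_PER_DAY / GB

inbound = avg_gb_per_sec(10e9)            # origin fetches for cache misses
outbound = avg_gb_per_sec(90e9 + 10e9)    # every request is served to a client

per_node_peak_gb = outbound * PEAK_FACTOR / NODE_COUNT
per_node_peak_gbit = per_node_peak_gb * 8

print(f"inbound {inbound:.1f} GB/s, outbound {outbound:.1f} GB/s")
print(f"per-node peak outbound: {per_node_peak_gb:.1f} GB/s = {per_node_peak_gbit:.0f} Gbit/s")
```

A node therefore needs on the order of 56 Gbit/s of egress at peak, which is worth checking against the NIC capacity of the chosen hardware.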

Internal Traffic

Cache Synchronization: 1TB/day (invalidations, metadata)
Health Checks: 100MB/day
Monitoring: 10GB/day (metrics, logs)
Total Internal: ~1.01TB/day ≈ 11.7MB/sec

Memory Requirements

Per-Node Memory Allocation

Hot Cache: 128GB (most frequently accessed)
Index/Metadata: 16GB (cache keys, headers, metadata)
Connection Buffers: 8GB (100K connections × 80KB)
Request Processing: 4GB (request parsing, response building)
Operating System: 4GB
Total per Node: 160GB RAM

Cluster Memory

Total Nodes: 100 nodes
Total Cluster Memory: 16TB RAM
Effective Cache: 12.8TB (hot data)
Cache Hit Rate: 95% for hot data
Memory Efficiency: 80% utilization
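
The per-node allocation and the cluster totals above can be tied together in a short calculation. This is a sketch with illustrative names, using decimal units (1TB = 1000GB).

```python
# Per-node allocation in GB, from the per-node list above.
allocation_gb = {
    "hot_cache": 128,
    "index_metadata": 16,
    "connection_buffers": 8,
    "request_processing": 4,
    "operating_system": 4,
}
node_count = 100

per_node_gb = sum(allocation_gb.values())                             # 160
connection_buffer_gb = 100_000 * 80e3 / 1e9                           # 100K connections x 80KB
cluster_tb = per_node_gb * node_count / 1000                          # 16
effective_cache_tb = allocation_gb["hot_cache"] * node_count / 1000   # 12.8
cache_fraction = effective_cache_tb / cluster_tb                      # 0.8

print(f"{per_node_gb} GB/node, {cluster_tb} TB cluster, "
      f"{effective_cache_tb} TB cache ({cache_fraction:.0%} of RAM)")
```

The 80% figure is the share of RAM that holds cached objects; the remaining 20% goes to metadata, buffers, request handling, and the operating system.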

Memory Distribution

L1 Cache (Memory): 128GB per node, <500μs latency (P99)
L2 Cache (SSD): 10TB per node, <5ms latency
L3 (Origin): Unlimited, <100ms latency
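
The three-tier layout implies a read-through lookup order: memory first, then SSD, then origin. The toy sketch below illustrates that order only. Plain dictionaries stand in for both cache tiers, there is no eviction, TTL, or concurrency handling, and the class and method names are hypothetical.

```python
from typing import Callable, Dict, Tuple

class TieredCache:
    """Toy read-through cache: memory (L1), then SSD (L2), then origin."""

    def __init__(self, fetch_origin: Callable[[str], bytes]):
        self.l1: Dict[str, bytes] = {}   # stands in for the in-memory tier
        self.l2: Dict[str, bytes] = {}   # stands in for the SSD tier
        self.fetch_origin = fetch_origin

    def get(self, key: str) -> Tuple[bytes, str]:
        """Return (value, name of the tier that served the request)."""
        if key in self.l1:
            return self.l1[key], "L1"
        if key in self.l2:
            value = self.l2[key]
            self.l1[key] = value          # promote to memory on an SSD hit
            return value, "L2"
        value = self.fetch_origin(key)    # miss in both tiers: go to origin
        self.l2[key] = value              # populate both tiers for later requests
        self.l1[key] = value
        return value, "origin"

# Usage: the first request goes to origin, the repeat is served from memory.
cache = TieredCache(fetch_origin=lambda key: f"body of {key}".encode())
print(cache.get("/index.html")[1])
print(cache.get("/index.html")[1])
```

A real node would bound both tiers and evict (for example with LRU), which is where the cache management CPU budget would be spent.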

Compute Resources

CPU Requirements

Per-Node CPU: 32 cores (2 × 16-core processors)
CPU Utilization: 50% average, 80% peak
Request Processing: 0.1ms CPU per request
Compression: 10% CPU for gzip/brotli
TLS Termination: 15% CPU for HTTPS
Cache Management: 5% CPU for eviction, invalidation

Cluster Compute

Total Nodes: 100 nodes
Total CPU Cores: 3,200 cores
Compute Capacity: 1.16M req/sec (average load) at 50% CPU
Headroom: 50% capacity for growth and spikes
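
The 0.1ms of CPU per request can be turned into a core count. The sketch below covers request processing only; compression, TLS termination, and cache management are budgeted above as separate percentages and are not modeled here. Names are illustrative.

```python
def cores_busy(requests_per_sec, cpu_seconds_per_request):
    """Average number of cores kept busy by request processing alone."""
    return requests_per_sec * cpu_seconds_per_request

CORES_PER_NODE = 32
CPU_PER_REQUEST = 0.1e-3   # 0.1ms, from the list above

# Per-node read throughput target of 100K req/s.
at_throughput_target = cores_busy(100_000, CPU_PER_REQUEST)
# Average traffic (100B requests/day) spread evenly over 100 nodes.
at_average_load = cores_busy(100e9 / 86_400 / 100, CPU_PER_REQUEST)

print(f"{at_throughput_target:.1f} cores "
      f"({at_throughput_target / CORES_PER_NODE:.0%} of a node) at 100K req/s")
print(f"{at_average_load:.2f} cores at average load")
```

Request processing alone would occupy about 10 of 32 cores at the per-node throughput target, and a little over one core at average load.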

Disk I/O Requirements

Disk Specifications

Hot Tier: NVMe SSD, 3TB per node
- Read IOPS: 500K IOPS
- Write IOPS: 100K IOPS
- Sequential Read: 3GB/sec
- Sequential Write: 2GB/sec
Warm Tier: SATA SSD, 7TB per node
- Read IOPS: 100K IOPS
- Write IOPS: 50K IOPS
- Sequential Read: 500MB/sec
- Sequential Write: 400MB/sec

I/O Patterns

Read I/O: 10K IOPS per node (10% miss rate)
Write I/O: 5K IOPS per node (cache updates)
Sequential Reads: 500MB/sec (large file streaming)
Sequential Writes: 200MB/sec (cache population)

Cache Hit Rate Analysis

Hit Rate by Content Type

Static Assets: 95% hit rate (CSS, JS, images)
HTML Pages: 85% hit rate (with personalization)
API Responses: 70% hit rate (shorter TTL)
Video Content: 90% hit rate (popular content)
Overall: ~89% hit rate weighted by the request mix (rounded to 90% for planning)
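
The overall figure can be derived by weighting each content type's hit rate by its share of requests from the Request Types list. A minimal sketch, with illustrative names:

```python
# (share of requests, hit rate) per content type, from the lists above
content_mix = {
    "static_assets": (0.60, 0.95),
    "html_pages": (0.20, 0.85),
    "api_responses": (0.15, 0.70),
    "video_content": (0.05, 0.90),
}

overall_hit_rate = sum(share * hit for share, hit in content_mix.values())
print(f"overall hit rate: {overall_hit_rate:.1%}")
```

The weighted result is about 89%, which the rest of the document rounds to 90%.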

Hit Rate by Time

Peak Hours: 92% hit rate (hot content in cache)
Off-Peak: 88% hit rate (some cache eviction)
After Deployment: 60% hit rate (cold cache)
Steady State: 90% hit rate (warm cache)

Factors Affecting Hit Rate

TTL Configuration: Longer TTL = higher hit rate
Cache Size: Larger cache = higher hit rate
Traffic Patterns: Predictable traffic = higher hit rate
Content Popularity: Zipf-like distribution (roughly 80/20) = a small hot set serves most requests
Invalidation Frequency: More invalidations = lower hit rate
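
To illustrate why cache size and content popularity interact, the sketch below computes what fraction of requests target the most popular objects under a Zipf distribution. The catalogue size of 100,000 objects and the exponent of 1.0 are assumptions chosen for illustration, not measurements from this system.

```python
def zipf_request_share(total_objects, cached_objects, exponent=1.0):
    """Fraction of requests that target the `cached_objects` most popular items
    when the popularity of the item at rank r is proportional to 1 / r**exponent."""
    weights = [1.0 / rank ** exponent for rank in range(1, total_objects + 1)]
    return sum(weights[:cached_objects]) / sum(weights)

# Hypothetical catalogue of 100,000 objects; exponent 1.0 is an assumption.
TOTAL_OBJECTS = 100_000
for cached_fraction in (0.01, 0.05, 0.20):
    cached = round(TOTAL_OBJECTS * cached_fraction)
    share = zipf_request_share(TOTAL_OBJECTS, cached)
    print(f"caching top {cached_fraction:.0%} of objects covers {share:.0%} of requests")
```

Under these assumptions the most popular 20% of objects draw somewhat more than 80% of requests, and coverage grows slowly beyond that point, so each additional unit of cache buys less hit rate than the one before.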

Geographic Distribution

Regional Deployment

Region: US-East
- Nodes: 30
- Traffic: 40% of total
- Latency: <5ms local, <50ms cross-region

Region: US-West
- Nodes: 20
- Traffic: 20% of total
- Latency: <5ms local, <70ms cross-region

Region: EU-West
- Nodes: 25
- Traffic: 30% of total
- Latency: <5ms local, <100ms cross-US

Region: Asia-Pacific
- Nodes: 15
- Traffic: 15% of total
- Latency: <5ms local, <150ms cross-US

Region: Other
- Nodes: 10
- Traffic: 5% of total
- Latency: <10ms local, <200ms cross-region
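
Comparing each region's traffic share with its node share shows how evenly load is spread. The regional traffic shares as listed sum to 110%, so this sketch normalizes them before comparing; that normalization is an assumption about how to reconcile the figures. Names are illustrative.

```python
# (nodes, share of traffic) per region, as listed above
regions = {
    "US-East": (30, 0.40),
    "US-West": (20, 0.20),
    "EU-West": (25, 0.30),
    "Asia-Pacific": (15, 0.15),
    "Other": (10, 0.05),
}

total_nodes = sum(nodes for nodes, _ in regions.values())
total_share = sum(share for _, share in regions.values())   # 1.10 as listed

# Relative load per node: normalized traffic share divided by node share.
# 1.0 means a region's nodes carry exactly the cluster-average load.
relative_load = {
    name: (share / total_share) / (nodes / total_nodes)
    for name, (nodes, share) in regions.items()
}

for name, load in relative_load.items():
    print(f"{name:<13} {load:.2f}x average per-node load")
```

With these inputs US-East nodes carry about 1.2x the average per-node load, while nodes in the Other region carry under half of it.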

Cross-Region Traffic

Cache Synchronization: Invalidations propagated globally
Origin Failover: Route to nearest healthy origin
Content Replication: Popular content replicated across regions
Latency Impact: <100ms for cross-region cache misses

Cost Estimation

Infrastructure Costs (Monthly)

Compute: 100 nodes × $500 = $50,000
Memory: 16TB RAM included in compute
Storage: 1PB × $0.10/GB = $100,000
Network: 600TB/month × $0.05/GB = $30,000
Load Balancers: $5,000
Total Infrastructure: $185,000/month

Operational Costs (Monthly)

Monitoring: $3,000
Logging: $2,000
Support: $10,000
Personnel: 3 engineers × $15,000 = $45,000
Total Operational: $60,000/month

Total Cost of Ownership

Monthly Total: $245,000
Annual Total: $2,940,000
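Cost per Request: ~$0.00000008 per request ($245,000 / 3 trillion requests per month, about $0.08 per million)
Cost per GB Served: ~$0.0004 per GB ($245,000 / 600PB served per month)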

Cost Savings

Origin Infrastructure Savings: $500,000/month (85% offload)
Bandwidth Savings: $200,000/month (80% reduction)
Total Savings: $700,000/month
Net Savings: $455,000/month ($5.46M/year)
ROI: 186% return on investment
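
The cost totals and unit costs can be recomputed from the line items. The sketch below assumes a 30-day month and uses the daily request and outbound volumes stated earlier (100 billion requests and 20PB per day); variable names are illustrative.

```python
infrastructure = {"compute": 50_000, "storage": 100_000,
                  "network": 30_000, "load_balancers": 5_000}
operations = {"monitoring": 3_000, "logging": 2_000,
              "support": 10_000, "personnel": 45_000}
savings = {"origin_infrastructure": 500_000, "bandwidth": 200_000}

monthly_cost = sum(infrastructure.values()) + sum(operations.values())   # 245,000
monthly_savings = sum(savings.values())                                  # 700,000
net_savings = monthly_savings - monthly_cost                             # 455,000
roi = net_savings / monthly_cost

# Assumes a 30-day month at the daily volumes stated earlier.
requests_per_month = 100e9 * 30
gb_served_per_month = 20e15 * 30 / 1e9   # 20PB/day outbound

cost_per_million_requests = monthly_cost / requests_per_month * 1e6
cost_per_gb = monthly_cost / gb_served_per_month

print(f"monthly cost ${monthly_cost:,}, net savings ${net_savings:,}, ROI {roi:.0%}")
print(f"${cost_per_million_requests:.3f} per million requests, ${cost_per_gb:.5f} per GB")
```

This gives roughly $0.08 per million requests and $0.0004 per GB served.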

This scale analysis provides the foundation for capacity planning, infrastructure provisioning, and cost optimization for a production-grade distributed web caching system.