CDN Network - Scale and Constraints
Traffic Volume
Global Request Rate
- Aggregate requests: 10-50 million requests/second globally (equivalent to roughly 26-130 trillion requests/month over a 30-day month)
- HTTP GET dominance: 95%+ of requests are GET; remaining are HEAD, OPTIONS (CORS preflight), and POST (for edge functions)
- Request size distribution: Average request header ~800 bytes; average response varies by content type
- Connection reuse: 70-80% of requests served over persistent HTTP/2 or HTTP/3 connections
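The per-second and per-month request figures can be cross-checked with a few lines of arithmetic. This is a minimal sketch that assumes a 30-day month and a constant rate; the helper name `requests_per_month` is illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 s, assuming a 30-day month


def requests_per_month(requests_per_second: float) -> float:
    """Convert a sustained request rate into a monthly request count."""
    return requests_per_second * SECONDS_PER_MONTH


for rps in (10e6, 50e6):
    trillions = requests_per_month(rps) / 1e12
    print(f"{rps / 1e6:.0f}M req/s -> {trillions:.1f} trillion req/month")
```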
Bandwidth Per Edge Node
- Tier 1 PoPs (major metros: NYC, London, Tokyo): 1-10 Tbps aggregate capacity
- Tier 2 PoPs (secondary cities): 100 Gbps - 1 Tbps
- Tier 3 PoPs (smaller markets, ISP-embedded): 10-100 Gbps
- Per-server throughput: 40-100 Gbps per commodity server with DPDK/kernel bypass networking
- Typical utilization target: 40-60% of peak capacity to absorb traffic spikes
Total Egress
- Sustained global egress: 100-500 Tbps during peak hours
- Monthly data transfer: 5-20 exabytes/month for a top-tier CDN (an average rate of roughly 15-60 Tbps, well below the peak-hour figure)
- Egress cost structure: $0.01-0.08/GB depending on region (cheapest in NA/EU, most expensive in APAC/LATAM)
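Monthly transfer volume reflects the average egress rate rather than the peak, so it helps to convert between the two units when comparing these figures. The sketch below assumes decimal units (1 EB = 10^18 bytes, 1 Tbps = 10^12 bits/s) and a 30-day month; both function names are illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assuming a 30-day month


def exabytes_per_month(avg_tbps: float) -> float:
    """Monthly volume (EB) delivered at a constant average rate (Tbps)."""
    bytes_per_second = avg_tbps * 1e12 / 8
    return bytes_per_second * SECONDS_PER_MONTH / 1e18


def average_tbps(exabytes: float) -> float:
    """Average rate (Tbps) needed to deliver a given monthly volume (EB)."""
    bits = exabytes * 1e18 * 8
    return bits / SECONDS_PER_MONTH / 1e12


for eb in (5, 20):
    print(f"{eb} EB/month is an average of about {average_tbps(eb):.1f} Tbps")
```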
Storage Requirements
Cached Content Per PoP
| PoP Tier | SSD Storage | HDD Storage | Total Capacity | Unique Objects |
|---|---|---|---|---|
| Tier 1 | 500 TB SSD | 5 PB HDD | ~5.5 PB | 50-100 billion |
| Tier 2 | 100 TB SSD | 1 PB HDD | ~1.1 PB | 10-20 billion |
| Tier 3 | 20 TB SSD | 200 TB HDD | ~220 TB | 2-5 billion |
- Hot tier (SSD): Top 5-10% of content by access frequency; serves 80-90% of requests
- Warm tier (HDD): Long-tail content; serves 10-20% of requests
- Memory cache: 256-512 GB RAM per server for ultra-hot objects (top 0.1%)
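The RAM, SSD, and HDD tiers above can be pictured as a lookup that tries the fastest tier first. The following is a toy sketch only: each tier is a plain dict, there is no eviction or capacity limit, and the promote-one-tier-on-hit policy is an assumption for illustration rather than something stated in this document.

```python
from typing import Optional


class TieredCache:
    """Toy three-tier cache: RAM -> SSD -> HDD, each modeled as a dict."""

    TIERS = ("ram", "ssd", "hdd")

    def __init__(self) -> None:
        self.stores: dict[str, dict[str, bytes]] = {t: {} for t in self.TIERS}

    def put(self, key: str, value: bytes, tier: str = "hdd") -> None:
        """Store an object in the named tier (new objects start cold)."""
        self.stores[tier][key] = value

    def get(self, key: str) -> tuple[Optional[bytes], Optional[str]]:
        """Return (value, tier_that_served_it), or (None, None) on a miss."""
        for i, tier in enumerate(self.TIERS):
            if key in self.stores[tier]:
                value = self.stores[tier][key]
                if i > 0:
                    # Hypothetical policy: promote one tier up on each hit,
                    # so frequently requested objects drift toward RAM.
                    self.stores[self.TIERS[i - 1]][key] = value
                    del self.stores[tier][key]
                return value, tier
        return None, None
```

A real edge cache would add size-aware eviction per tier and admission control so that one-hit objects do not displace hot content.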
Origin Storage
- Total origin capacity: Varies by customer; CDN doesn't own origin content
- Origin shield cache: 50-200 TB per shield region (aggregates misses before hitting origin)
- Typical origin data set: 1 TB - 100 TB per large customer distribution
Metadata Storage
- Cache index per server: 50-100 GB in RAM (URL hash → disk location mapping)
- Per-object metadata: ~200 bytes (URL hash, content hash, TTL, headers, size, timestamps)
- Global configuration database: 10-50 TB (routing rules, certificates, customer configs)
- DNS zone data: 5-10 GB (millions of CNAME records, routing policies)
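The index size and per-object metadata figures together bound how many objects one server can track. A rough sizing sketch, assuming decimal gigabytes and no hash-table overhead beyond the ~200 bytes per object (the function name is illustrative):

```python
PER_OBJECT_METADATA_BYTES = 200  # from the per-object metadata estimate


def indexable_objects(index_ram_gb: float,
                      bytes_per_object: int = PER_OBJECT_METADATA_BYTES) -> int:
    """Number of objects whose metadata fits in the given index RAM."""
    return int(index_ram_gb * 1e9 // bytes_per_object)


for gb in (50, 100):
    millions = indexable_objects(gb) / 1e6
    print(f"{gb} GB index -> about {millions:.0f} million objects per server")
```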
Network Bandwidth
Inter-PoP Communication
- Backbone links: 100 Gbps - 400 Gbps between major PoPs
- Purpose: Cache fill from parent/peer PoPs, configuration propagation, health data
- Traffic volume: 5-15% of total egress (most content served from local cache)
- Latency budget: <50ms between regional PoPs; <150ms cross-continent
Origin-Pull Bandwidth
- Cache miss rate: 5-15% of requests result in origin fetch
- Origin pull bandwidth: 5-50 Tbps globally (aggregated across all origins)
- Connection pooling: Maintain persistent connections to origins; 100-1000 connections per origin per PoP
- Origin shield reduction: Reduces origin load by 60-80% by consolidating cache fills
Client-Facing Bandwidth
- Last-mile delivery: Dominated by video (70-80% of bytes), images (10-15%), other (5-10%)
- Peak-to-trough ratio: 3-5x within a region (evening peak vs. overnight trough)
- Per-user bandwidth: 5-50 Mbps for video streaming; 1-5 Mbps for web browsing
- Protocol distribution: HTTP/2 (60%), HTTP/3 QUIC (25%), HTTP/1.1 (15%)
Geographic Distribution
Points of Presence (PoPs)
- Total PoPs: 200-400 globally (for comparison, Cloudflare publicly cites 300+ cities, while Akamai's more distributed model spans 4,000+ edge locations)
- Tier 1 PoPs: 20-30 (major internet exchange points)
- Tier 2 PoPs: 50-100 (regional hubs)
- Tier 3 PoPs: 100-300 (last-mile, ISP-embedded)
Regional Distribution
| Region | PoPs | % of Traffic | Avg Latency to User |
|---|---|---|---|
| North America | 60-80 | 30-35% | <10ms |
| Europe | 50-70 | 25-30% | <15ms |
| Asia Pacific | 40-60 | 20-25% | <20ms |
| Latin America | 15-25 | 5-8% | <25ms |
| Middle East/Africa | 10-20 | 3-5% | <30ms |
| Oceania | 5-10 | 2-3% | <15ms |
Edge Location Strategy
- IXP co-location: Place servers at Internet Exchange Points for peering with multiple ISPs
- ISP-embedded: Deploy cache servers inside ISP networks (like Netflix Open Connect)
- Cloud region adjacency: PoPs near major cloud regions for fast origin pulls
- Population density: Correlate PoP placement with internet user density
Content Types and Sizes
Static Assets
| Content Type | Avg Size | % of Requests | % of Bandwidth | Cacheability |
|---|---|---|---|---|
| Images (JPEG/PNG/WebP) | 50-500 KB | 40-50% | 15-20% | Highly cacheable |
| JavaScript bundles | 100-500 KB | 15-20% | 5-10% | Cacheable with versioning |
| CSS files | 20-100 KB | 10-15% | 2-5% | Cacheable with versioning |
| Fonts (WOFF2) | 20-100 KB | 5-8% | 1-3% | Highly cacheable |
| HTML documents | 10-100 KB | 10-15% | 2-5% | Short TTL or uncacheable |
Video Streaming
| Format | Segment Size | Bitrate Range | % of Bandwidth |
|---|---|---|---|
| HLS/DASH segments | 2-10 MB | 1-15 Mbps | 60-70% |
| Manifests (.m3u8/.mpd) | 1-50 KB | N/A | <1% |
| Thumbnails/previews | 10-100 KB | N/A | 2-3% |
API Responses
- JSON API responses: 1-50 KB average; cacheable for 1-60 seconds
- GraphQL responses: 5-100 KB; often uncacheable due to personalization
- Webhook/event data: Typically not cached; passed through edge
Cache Hit Ratios and Origin Load Impact
Target Hit Ratios by Content Type
| Content Type | Target Hit Ratio | Origin Requests Saved |
|---|---|---|
| Static images | 95-99% | 95-99% |
| Video segments | 85-95% | 85-95% |
| CSS/JS bundles | 90-98% | 90-98% |
| HTML pages | 60-80% | 60-80% |
| API responses | 40-70% | 40-70% |
| Personalized content | 0-20% | 0-20% |
Impact on Origin Load
- Without CDN: Origin handles 50M req/s directly → requires massive origin infrastructure
- With 90% hit ratio: Origin handles 5M req/s → 10x reduction in origin capacity needed
- With 95% hit ratio: Origin handles 2.5M req/s → 20x reduction
- Bandwidth savings: 90% hit ratio saves $10-50M/month in origin egress costs
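The origin-load figures follow directly from the hit ratio: the origin sees only the misses. A minimal sketch of that arithmetic, assuming every miss results in exactly one origin request (no request coalescing or origin shield):

```python
def origin_rps(edge_rps: float, hit_ratio: float) -> float:
    """Requests per second that reach the origin at a given hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return edge_rps * (1.0 - hit_ratio)


def reduction_factor(hit_ratio: float) -> float:
    """How many times smaller origin load is compared with no CDN."""
    if not 0.0 <= hit_ratio < 1.0:
        raise ValueError("hit_ratio must be in [0, 1)")
    return 1.0 / (1.0 - hit_ratio)


for ratio in (0.90, 0.95, 0.99):
    print(f"hit ratio {ratio:.0%}: "
          f"{origin_rps(50e6, ratio) / 1e6:.1f}M req/s at origin, "
          f"{reduction_factor(ratio):.0f}x reduction")
```

Because the reduction factor is 1 / (1 - hit ratio), small gains near the top of the range matter most: moving from 95% to 99% cuts origin load by another 5x.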
Factors Affecting Hit Ratio
- Content diversity: More unique URLs → lower hit ratio per object
- TTL configuration: Longer TTLs → higher hit ratio but staleness risk
- Query string handling: Ignoring irrelevant query params improves hit ratio
- Geographic spread: Popular content in one region may be cold in another
- Cache capacity: Larger caches hold more long-tail content
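Query string handling is the factor most directly under the operator's control. The sketch below normalizes a URL into a cache key by dropping parameters that do not affect the response and sorting the rest. The `IGNORED_PARAMS` set is a hypothetical example; which parameters are safe to ignore depends on the application.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of tracking parameters that do not change the response.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def cache_key(url: str) -> str:
    """Build a normalized cache key so equivalent URLs share one cache entry."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    )
    # Lowercase the host, keep the path as-is, and drop the fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


a = cache_key("https://Example.com/img.png?w=100&utm_source=mail")
b = cache_key("https://example.com/img.png?w=100")
print(a == b)
```

Sorting the remaining parameters also makes `?a=1&b=2` and `?b=2&a=1` map to the same entry.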
Cost Estimation
Bandwidth Costs (Monthly)
| Component | Cost Range | Notes |
|---|---|---|
| Transit bandwidth | $0.50-2.00/Mbps/month | Varies by region and commitment |
| Peering (settlement-free) | $0 (capex only) | Requires traffic ratio balance |
| ISP-embedded delivery | $0.005-0.01/GB | Revenue share model |
| Cross-connect fees | $500-2000/port/month | Per IXP connection |
| Total bandwidth | $15-30M/month | For a top-10 CDN |
Storage Costs (Monthly)
| Component | Cost | Notes |
|---|---|---|
| SSD (hot cache) | $0.10-0.20/GB/month | NVMe drives, 3-year amortization |
| HDD (warm cache) | $0.02-0.05/GB/month | High-capacity drives |
| RAM (memory cache) | $5-8/GB/month | Server memory |
| Total storage | $5-10M/month | Across all PoPs |
Compute at Edge (Monthly)
| Component | Cost | Notes |
|---|---|---|
| Server hardware (amortized) | $10-15M | 3-year refresh cycle |
| Power and cooling | $5-8M | 10-20 kW per rack |
| Rack space / colocation | $3-5M | Per-rack fees at IXPs |
| Network equipment | $2-4M | Routers, switches, load balancers |
| Total compute | $20-32M/month | Sum of the rows above |
Total Cost of Ownership
- Monthly operational cost: $40-72M for a global CDN (sum of the bandwidth, storage, and compute totals above)
- Cost per GB delivered: $0.005-0.02 (internal cost)
- Customer pricing: $0.02-0.15/GB (margin depends on volume and region)
- Cost per million requests: $0.50-2.00
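The roll-up can be reproduced by summing the three category totals and dividing by delivered volume. This sketch takes the low and high ends of each range from the tables in this section and assumes decimal units (1 EB = 10^9 GB). The resulting per-GB range is a rough bound, not a precise unit cost.

```python
# (low, high) monthly totals in millions of USD, from the tables above.
MONTHLY_COST_MUSD = {
    "bandwidth": (15, 30),
    "storage": (5, 10),
    "compute": (20, 32),
}


def total_cost_musd(costs: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Sum the low ends and the high ends of each category range."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


def cost_per_gb(monthly_cost_usd: float, exabytes_per_month: float) -> float:
    """Internal cost per GB delivered, with 1 EB = 1e9 GB."""
    return monthly_cost_usd / (exabytes_per_month * 1e9)


low, high = total_cost_musd(MONTHLY_COST_MUSD)
print(f"Total: ${low}-{high}M/month")
# Best case: lowest cost over highest volume; worst case: the reverse.
print(f"Per GB: ${cost_per_gb(low * 1e6, 20):.4f} to ${cost_per_gb(high * 1e6, 5):.4f}")
```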
Peak Traffic Patterns
Live Events
- Major sporting events (Super Bowl, World Cup): 10-50x normal traffic spike
- Product launches (Apple keynote, game releases): 5-20x spike in specific regions
- Breaking news: 3-10x spike, geographically concentrated initially then spreading
- Ramp-up time: Seconds to minutes; must absorb without pre-warming
Flash Crowds
- Viral content: Single object goes from 0 to millions of requests/second
- Social media amplification: Tweet/post drives traffic to single URL
- Thundering herd on cache expiry: Many simultaneous requests for expired popular object
- Mitigation: Request coalescing, stale-while-revalidate, adaptive TTLs
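Request coalescing, the first mitigation listed, means that concurrent misses for the same object share a single origin fetch. A minimal asyncio sketch follows; the `Coalescer` class and `fake_origin_fetch` are hypothetical names, and a production version would also need timeouts and error handling.

```python
import asyncio
from typing import Awaitable, Callable


class Coalescer:
    """Collapse concurrent requests for the same key into one fetch."""

    def __init__(self, fetch: Callable[[str], Awaitable[bytes]]) -> None:
        self._fetch = fetch
        self._inflight: dict[str, asyncio.Future] = {}

    async def get(self, key: str) -> bytes:
        task = self._inflight.get(key)
        if task is None:
            # First request for this key: start the fetch and register it.
            task = asyncio.ensure_future(self._fetch(key))
            self._inflight[key] = task
            # Forget the entry once the fetch finishes (success or failure).
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        # shield() keeps one cancelled waiter from cancelling the shared fetch.
        return await asyncio.shield(task)


async def demo() -> int:
    calls = 0

    async def fake_origin_fetch(key: str) -> bytes:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulated origin latency
        return f"body-of-{key}".encode()

    coalescer = Coalescer(fake_origin_fetch)
    bodies = await asyncio.gather(
        *(coalescer.get("/video/seg1.ts") for _ in range(100))
    )
    assert all(body == bodies[0] for body in bodies)
    return calls  # number of origin fetches actually issued


print(asyncio.run(demo()))
```

In the demo, 100 concurrent requests for one URL result in a single call to the simulated origin.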
Diurnal Patterns
- Peak hours: 7-11 PM local time (video streaming dominates)
- Business hours: 9 AM - 5 PM (web/API traffic peaks)
- Global smoothing: Follow-the-sun pattern means global peak is 2-3x global trough
- Weekend vs weekday: Video traffic 20-40% higher on weekends
Capacity Planning for Peaks
- Provisioning target: Handle 3-5x average traffic without degradation
- Burst capacity: Additional 2-3x via traffic shedding and graceful degradation
- Auto-scaling: Add capacity within minutes for sustained spikes
- Pre-positioning: Warm caches before known events (scheduled launches, sports)
Latency Budgets
End-to-End Latency Breakdown (Cache Hit)
| Component | Budget | Notes |
|---|---|---|
| DNS resolution | 5-20ms | Anycast DNS, client-side caching |
| TCP + TLS handshake | 10-30ms | 0-RTT with TLS 1.3 / QUIC |
| Request to edge | 1-5ms | Last-mile network |
| Edge processing | 1-5ms | Cache lookup, header processing |
| Response transfer | 5-50ms | Depends on object size |
| Total (cache hit) | ~20-110ms | Sum of the rows above |
End-to-End Latency Breakdown (Cache Miss)
| Component | Budget | Notes |
|---|---|---|
| Client-to-edge path (all cache-hit components) | ~20-110ms | Client to edge |
| Edge to origin | 50-200ms | Depends on origin location |
| Origin processing | 50-500ms | Application-dependent |
| Origin to edge | 50-200ms | Return path |
| Total (cache miss) | ~170-1,010ms | Sum of the rows above |
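The totals in both tables are sums of the component budgets, and the latency a typical user sees is a blend of the hit and miss paths weighted by hit ratio. A small sketch of both calculations, using the per-component ranges from the tables (the dictionary keys and function names are illustrative):

```python
# (low, high) budgets in milliseconds, taken from the two tables above.
HIT_BUDGET_MS = {
    "dns": (5, 20),
    "tcp_tls": (10, 30),
    "request_to_edge": (1, 5),
    "edge_processing": (1, 5),
    "response_transfer": (5, 50),
}
MISS_EXTRA_MS = {
    "edge_to_origin": (50, 200),
    "origin_processing": (50, 500),
    "origin_to_edge": (50, 200),
}


def total_ms(budget: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the low ends and high ends of each component budget."""
    return (sum(lo for lo, _ in budget.values()),
            sum(hi for _, hi in budget.values()))


def expected_latency_ms(hit_ms: float, miss_ms: float, hit_ratio: float) -> float:
    """Average latency when a fraction hit_ratio of requests are cache hits."""
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


hit_low, hit_high = total_ms(HIT_BUDGET_MS)
extra_low, extra_high = total_ms(MISS_EXTRA_MS)
print(f"cache hit:  {hit_low}-{hit_high} ms")
print(f"cache miss: {hit_low + extra_low}-{hit_high + extra_high} ms")
```

For instance, `expected_latency_ms(30, 400, 0.9)` blends a 30 ms hit and a 400 ms miss at a 90% hit ratio. Even a small miss fraction dominates the average when the miss path is an order of magnitude slower.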
SLA Targets
- Availability: 99.99% (about 52.6 minutes of downtime per year)
- P50 latency (hit): <30ms
- P95 latency (hit): <75ms
- P99 latency (hit): <150ms
- Cache hit ratio: >90% for static content
- Purge propagation: <5 seconds globally
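An availability target translates into a yearly downtime budget by simple proportion. A quick sketch, assuming a 365-day year (the function name is illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Allowed downtime per year for a given availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1.0 - availability_pct / 100.0) * MINUTES_PER_YEAR


for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):.1f} min/year")
```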