Scale & Constraints

Part 2 of 10

CDN Network - Scale and Constraints

Traffic Volume

Global Request Rate

  • Aggregate requests: 10-50 million requests/second globally (roughly 26-130 trillion requests/month if sustained over a 30-day month)
  • HTTP GET dominance: 95%+ of requests are GET; remaining are HEAD, OPTIONS (CORS preflight), and POST (for edge functions)
  • Request size distribution: Average request header ~800 bytes; average response varies by content type
  • Connection reuse: 70-80% of requests served over persistent HTTP/2 or HTTP/3 connections

Bandwidth Per Edge Node

  • Tier 1 PoPs (major metros: NYC, London, Tokyo): 1-10 Tbps aggregate capacity
  • Tier 2 PoPs (secondary cities): 100 Gbps - 1 Tbps
  • Tier 3 PoPs (smaller markets, ISP-embedded): 10-100 Gbps
  • Per-server throughput: 40-100 Gbps per commodity server with DPDK/kernel bypass networking
  • Typical utilization target: 40-60% of peak capacity to absorb traffic spikes
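
As a rough sizing sketch, the per-server throughput and utilization target above can be turned into a server count for a single PoP. The function name `servers_needed` and the example inputs are illustrative assumptions, not figures from a real deployment.

```python
import math


def servers_needed(pop_peak_gbps: float, per_server_gbps: float,
                   target_utilization: float) -> int:
    """Servers required so expected peak traffic sits at the utilization target."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Provision enough capacity that the expected peak uses only the target share.
    provisioned_gbps = pop_peak_gbps / target_utilization
    return math.ceil(provisioned_gbps / per_server_gbps)


# Assumed example: a Tier 2 PoP expecting 500 Gbps at peak,
# 100 Gbps servers, 50% utilization target.
print(servers_needed(500, 100, 0.5))  # 10
```

Running at 50% rather than 100% doubles the server count in this example; that headroom is what absorbs spikes.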

Total Egress

  • Sustained global egress: 100-500 Tbps during peak hours
  • Monthly data transfer: 5-20 exabytes/month for a top-tier CDN
  • Egress cost structure: $0.01-0.08/GB depending on region (cheapest in NA/EU, most expensive in APAC/LATAM)
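
To relate the monthly transfer volume to a bandwidth figure, the conversion below may help. It is a minimal sketch that assumes a 30-day month and decimal units (1 EB = 10^18 bytes); the function name is hypothetical.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month


def exabytes_per_month_to_avg_tbps(eb_per_month: float) -> float:
    """Average bandwidth (Tbps) implied by a monthly transfer volume."""
    bits = eb_per_month * 1e18 * 8          # decimal exabytes -> bits
    return bits / SECONDS_PER_MONTH / 1e12  # bits/second -> Tbps


for volume_eb in (5, 20):
    print(volume_eb, "EB/month ~", round(exabytes_per_month_to_avg_tbps(volume_eb), 1), "Tbps average")
# 5 EB/month ~ 15.4 Tbps average
# 20 EB/month ~ 61.7 Tbps average
```

The result is a month-long average, so it sits well below the peak-hour egress figure.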

Storage Requirements

Cached Content Per PoP

| PoP Tier | SSD Storage | HDD Storage | Total Capacity | Unique Objects |
|---|---|---|---|---|
| Tier 1 | 500 TB | 5 PB | ~5.5 PB | 50-100 billion |
| Tier 2 | 100 TB | 1 PB | ~1.1 PB | 10-20 billion |
| Tier 3 | 20 TB | 200 TB | ~220 TB | 2-5 billion |

  • Hot tier (SSD): Top 5-10% of content by access frequency; serves 80-90% of requests
  • Warm tier (HDD): Long-tail content; serves 10-20% of requests
  • Memory cache: 256-512 GB RAM per server for ultra-hot objects (top 0.1%)
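
The tiering rule above can be sketched as a placement function keyed on popularity rank. This is an illustrative assumption about how tier boundaries might be applied; the function name and the default thresholds (top 0.1% in RAM, top 10% on SSD) are chosen from the ranges in the list, not taken from a specific CDN.

```python
def tier_for_rank(rank: int, total_objects: int,
                  ram_fraction: float = 0.001,
                  ssd_fraction: float = 0.10) -> str:
    """Pick a storage tier from an object's popularity rank (0 = most requested)."""
    if not 0 <= rank < total_objects:
        raise ValueError("rank must be in [0, total_objects)")
    percentile = rank / total_objects
    if percentile < ram_fraction:
        return "ram"   # ultra-hot objects
    if percentile < ssd_fraction:
        return "ssd"   # hot tier
    return "hdd"       # warm, long-tail tier


print(tier_for_rank(10, 1_000_000))       # ram
print(tier_for_rank(5_000, 1_000_000))    # ssd
print(tier_for_rank(500_000, 1_000_000))  # hdd
```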

Origin Storage

  • Total origin capacity: Varies by customer; CDN doesn't own origin content
  • Origin shield cache: 50-200 TB per shield region (aggregates misses before hitting origin)
  • Typical origin data set: 1 TB - 100 TB per large customer distribution

Metadata Storage

  • Cache index per server: 50-100 GB in RAM (URL hash → disk location mapping)
  • Per-object metadata: ~200 bytes (URL hash, content hash, TTL, headers, size, timestamps)
  • Global configuration database: 10-50 TB (routing rules, certificates, customer configs)
  • DNS zone data: 5-10 GB (millions of CNAME records, routing policies)
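
A minimal sketch of the URL hash → disk location mapping follows, with a back-of-the-envelope check on index size. The class and field names are hypothetical, and treating each index entry as about 200 bytes is an assumption borrowed from the per-object metadata figure. A Python dict illustrates the structure only; it does not reflect the memory layout of a production index.

```python
import hashlib
from typing import NamedTuple, Optional


class DiskLocation(NamedTuple):
    disk_id: int   # which drive holds the object
    offset: int    # byte offset on that drive
    length: int    # object size in bytes


def url_key(url: str) -> bytes:
    """Fixed-size key: first 16 bytes of the SHA-256 of the URL."""
    return hashlib.sha256(url.encode("utf-8")).digest()[:16]


class CacheIndex:
    def __init__(self) -> None:
        self._entries = {}  # url_key -> DiskLocation

    def put(self, url: str, location: DiskLocation) -> None:
        self._entries[url_key(url)] = location

    def lookup(self, url: str) -> Optional[DiskLocation]:
        return self._entries.get(url_key(url))


def index_ram_gb(object_count: int, bytes_per_entry: int = 200) -> float:
    """Approximate RAM needed for the index, in decimal GB."""
    return object_count * bytes_per_entry / 1e9


index = CacheIndex()
index.put("https://cdn.example.com/img/a.webp", DiskLocation(3, 4096, 51_200))
print(index.lookup("https://cdn.example.com/img/a.webp"))
print(index_ram_gb(250_000_000))  # 50.0
```

Under these assumptions, a 50-100 GB index corresponds to roughly 250-500 million objects per server.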

Network Bandwidth

Inter-PoP Communication

  • Backbone links: 100 Gbps - 400 Gbps between major PoPs
  • Purpose: Cache fill from parent/peer PoPs, configuration propagation, health data
  • Traffic volume: 5-15% of total egress (most content served from local cache)
  • Latency budget: <50ms between regional PoPs; <150ms cross-continent

Origin-Pull Bandwidth

  • Cache miss rate: 5-15% of requests result in origin fetch
  • Origin pull bandwidth: 5-50 Tbps globally (aggregated across all origins)
  • Connection pooling: Maintain persistent connections to origins; 100-1000 connections per origin per PoP
  • Origin shield reduction: Reduces origin load by 60-80% by consolidating cache fills
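
The combined effect of the edge miss rate and the origin shield can be estimated with a one-line model. This is a simplified sketch that assumes the shield reduction applies uniformly to all edge misses; the function name and example inputs are illustrative.

```python
def origin_rps(edge_rps: float, miss_rate: float,
               shield_reduction: float = 0.0) -> float:
    """Requests/second reaching origins after edge caching and shielding."""
    edge_misses = edge_rps * miss_rate
    # The shield consolidates cache fills, absorbing a share of edge misses.
    return edge_misses * (1 - shield_reduction)


# Assumed example: 1M req/s at the edge, 10% miss rate, 70% shield reduction.
print(round(origin_rps(1_000_000, 0.10, 0.70)))  # 30000
```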

Client-Facing Bandwidth

  • Last-mile delivery: Dominated by video (60-70% of bytes), then images (15-20%), with the remainder split across scripts, stylesheets, HTML, and fonts
  • Peak-to-trough ratio: 3-5x within a region (evening peak vs. overnight trough)
  • Per-user bandwidth: 5-50 Mbps for video streaming; 1-5 Mbps for web browsing
  • Protocol distribution: HTTP/2 (60%), HTTP/3 QUIC (25%), HTTP/1.1 (15%)

Geographic Distribution

Points of Presence (PoPs)

  • Total PoPs: 200-400 globally (for comparison, Cloudflare reports 300+ cities and Akamai 4,000+ edge locations)
  • Tier 1 PoPs: 20-30 (major internet exchange points)
  • Tier 2 PoPs: 50-100 (regional hubs)
  • Tier 3 PoPs: 100-300 (last-mile, ISP-embedded)

Regional Distribution

| Region | PoPs | % of Traffic | Avg Latency to User |
|---|---|---|---|
| North America | 60-80 | 30-35% | <10ms |
| Europe | 50-70 | 25-30% | <15ms |
| Asia Pacific | 40-60 | 20-25% | <20ms |
| Latin America | 15-25 | 5-8% | <25ms |
| Middle East/Africa | 10-20 | 3-5% | <30ms |
| Oceania | 5-10 | 2-3% | <15ms |

Edge Location Strategy

  • IXP co-location: Place servers at Internet Exchange Points for peering with multiple ISPs
  • ISP-embedded: Deploy cache servers inside ISP networks (like Netflix Open Connect)
  • Cloud region adjacency: PoPs near major cloud regions for fast origin pulls
  • Population density: Correlate PoP placement with internet user density

Content Types and Sizes

Static Assets

| Content Type | Avg Size | % of Requests | % of Bandwidth | Cacheability |
|---|---|---|---|---|
| Images (JPEG/PNG/WebP) | 50-500 KB | 40-50% | 15-20% | Highly cacheable |
| JavaScript bundles | 100-500 KB | 15-20% | 5-10% | Cacheable with versioning |
| CSS files | 20-100 KB | 10-15% | 2-5% | Cacheable with versioning |
| Fonts (WOFF2) | 20-100 KB | 5-8% | 1-3% | Highly cacheable |
| HTML documents | 10-100 KB | 10-15% | 2-5% | Short TTL or uncacheable |

Video Streaming

| Format | Segment Size | Bitrate Range | % of Bandwidth |
|---|---|---|---|
| HLS/DASH segments | 2-10 MB | 1-15 Mbps | 60-70% |
| Manifests (.m3u8/.mpd) | 1-50 KB | N/A | <1% |
| Thumbnails/previews | 10-100 KB | N/A | 2-3% |

API Responses

  • JSON API responses: 1-50 KB average; cacheable for 1-60 seconds
  • GraphQL responses: 5-100 KB; often uncacheable due to personalization
  • Webhook/event data: Typically not cached; passed through edge
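
One way these policies might be expressed is through the `Cache-Control` response header. The sketch below is an assumption about how an origin could signal them: it uses the standard `s-maxage` and `stale-while-revalidate` directives, while the helper name and the specific TTL values are illustrative.

```python
def cache_control(ttl_seconds: int, personalized: bool = False) -> str:
    """Build a Cache-Control header value for an API response."""
    if personalized or ttl_seconds <= 0:
        # Personalized or pass-through responses must not be stored at the edge.
        return "private, no-store"
    # s-maxage applies to shared caches such as CDN edges; stale-while-revalidate
    # lets the edge serve a stale copy while it refreshes in the background.
    return f"public, s-maxage={ttl_seconds}, stale-while-revalidate={ttl_seconds}"


print(cache_control(30))                     # JSON API, cacheable for 30 seconds
print(cache_control(30, personalized=True))  # personalized GraphQL response
print(cache_control(0))                      # webhook/event data, passed through
```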

Cache Hit Ratios and Origin Load Impact

Target Hit Ratios by Content Type

| Content Type | Target Hit Ratio | Origin Requests Saved |
|---|---|---|
| Static images | 95-99% | 95-99% |
| Video segments | 85-95% | 85-95% |
| CSS/JS bundles | 90-98% | 90-98% |
| HTML pages | 60-80% | 60-80% |
| API responses | 40-70% | 40-70% |
| Personalized content | 0-20% | 0-20% |

Impact on Origin Load

  • Without CDN: Origin handles 50M req/s directly → requires massive origin infrastructure
  • With 90% hit ratio: Origin handles 5M req/s → 10x reduction in origin capacity needed
  • With 95% hit ratio: Origin handles 2.5M req/s → 20x reduction
  • Bandwidth savings: 90% hit ratio saves $10-50M/month in origin egress costs
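
The arithmetic behind these figures is that origin load scales with the miss ratio, so the reduction factor is 1 / (1 - hit ratio). A small sketch, with a hypothetical function name:

```python
def origin_load(total_rps: float, hit_ratio: float) -> tuple:
    """Return (origin requests/second, reduction factor) for a given hit ratio."""
    if not 0 <= hit_ratio <= 1:
        raise ValueError("hit_ratio must be in [0, 1]")
    miss_ratio = 1 - hit_ratio
    reduction = float("inf") if miss_ratio == 0 else 1 / miss_ratio
    return total_rps * miss_ratio, reduction


for hit in (0.90, 0.95, 0.99):
    rps, factor = origin_load(50_000_000, hit)
    print(f"hit ratio {hit:.0%}: origin ~{rps / 1e6:.1f}M req/s, ~{factor:.0f}x reduction")
```

Each additional point of hit ratio matters more as the ratio rises: moving from 90% to 95% halves origin load, and moving from 95% to 99% cuts it by a further 5x.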

Factors Affecting Hit Ratio

  • Content diversity: More unique URLs → lower hit ratio per object
  • TTL configuration: Longer TTLs → higher hit ratio but staleness risk
  • Query string handling: Ignoring irrelevant query params improves hit ratio
  • Geographic spread: Popular content in one region may be cold in another
  • Cache capacity: Larger caches hold more long-tail content
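
Query string handling is the easiest of these factors to illustrate. The sketch below normalizes a URL into a cache key by dropping parameters that do not affect the response and sorting the rest. The ignored-parameter list is an assumption; in practice it would come from per-customer configuration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of tracking parameters that do not change the response.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}


def cache_key(url: str) -> str:
    """Normalize a URL so equivalent requests map to the same cache entry."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    )
    # Lowercase the host, drop the fragment, and emit parameters in sorted order.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path,
                       urlencode(kept), ""))


print(cache_key("https://Example.com/a.png?utm_source=x&w=200&h=100"))
# https://example.com/a.png?h=100&w=200
```

Without normalization, each tracking-parameter variant of the same image would occupy its own cache entry and trigger its own origin fetch.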

Cost Estimation

Bandwidth Costs (Monthly)

| Component | Cost Range | Notes |
|---|---|---|
| Transit bandwidth | $0.50-2.00/Mbps/month | Varies by region and commitment |
| Peering (settlement-free) | $0 (capex only) | Requires traffic ratio balance |
| ISP-embedded delivery | $0.005-0.01/GB | Revenue share model |
| Cross-connect fees | $500-2000/port/month | Per IXP connection |
| Total bandwidth | $15-30M/month | For a top-10 CDN |

Storage Costs (Monthly)

| Component | Cost | Notes |
|---|---|---|
| SSD (hot cache) | $0.10-0.20/GB/month | NVMe drives, 3-year amortization |
| HDD (warm cache) | $0.02-0.05/GB/month | High-capacity drives |
| RAM (memory cache) | $5-8/GB/month | Server memory |
| Total storage | $5-10M/month | Across all PoPs |

Compute at Edge (Monthly)

| Component | Monthly Cost | Notes |
|---|---|---|
| Server hardware (amortized) | $10-15M | 3-year refresh cycle |
| Power and cooling | $5-8M | 10-20 kW per rack |
| Rack space / colocation | $3-5M | Per-rack fees at IXPs |
| Network equipment | $2-4M | Routers, switches, load balancers |
| Total compute | $20-32M | |
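
The compute total is the sum of the low and high ends of each line item, which can be checked with a short roll-up. The helper name `sum_ranges` is hypothetical; the figures are the ones in the table.

```python
def sum_ranges(items: dict) -> tuple:
    """Add up (low, high) cost ranges, in millions of USD per month."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


compute_costs_musd = {
    "Server hardware (amortized)": (10, 15),
    "Power and cooling": (5, 8),
    "Rack space / colocation": (3, 5),
    "Network equipment": (2, 4),
}

print(sum_ranges(compute_costs_musd))  # (20, 32)
```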

Total Cost of Ownership

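  • Monthly operational cost: $50-80M for a global CDN (the bandwidth, storage, and compute totals above sum to $40-72M; the difference is not itemized in this section)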
  • Cost per GB delivered: $0.005-0.02 (internal cost)
  • Customer pricing: $0.02-0.15/GB (margin depends on volume and region)
  • Cost per million requests: $0.50-2.00
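
As a sanity check on the per-GB figure, dividing monthly cost by monthly volume gives a unit cost. The inputs below ($65M/month and 10 EB/month) are assumed mid-range values picked for illustration, and 1 EB is taken as 10^9 GB.

```python
def cost_per_gb(monthly_cost_usd: float, exabytes_per_month: float) -> float:
    """Internal delivery cost in USD per GB (decimal units)."""
    gigabytes = exabytes_per_month * 1e9  # 1 EB = 1e9 GB
    return monthly_cost_usd / gigabytes


print(round(cost_per_gb(65e6, 10), 4))  # 0.0065
```

Under these assumptions the unit cost falls inside the $0.005-0.02/GB range; other combinations of cost and volume can land outside it.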

Peak Traffic Patterns

Live Events

  • Major sporting events (Super Bowl, World Cup): 10-50x normal traffic spike
  • Product launches (Apple keynote, game releases): 5-20x spike in specific regions
  • Breaking news: 3-10x spike, geographically concentrated initially then spreading
  • Ramp-up time: Seconds to minutes; unscheduled spikes must be absorbed without pre-warming

Flash Crowds

  • Viral content: Single object goes from 0 to millions of requests/second
  • Social media amplification: Tweet/post drives traffic to single URL
  • Thundering herd on cache expiry: Many simultaneous requests for expired popular object
  • Mitigation: Request coalescing, stale-while-revalidate, adaptive TTLs
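
Request coalescing, the first mitigation listed, can be sketched as follows: the first request for a key becomes the leader and fetches from origin, while concurrent requests for the same key wait on the leader's result. This is a thread-based illustration under simplifying assumptions (single process, no timeouts); the class name `Coalescer` and the `demo` helper are hypothetical.

```python
import threading
import time
from concurrent.futures import Future


class Coalescer:
    """Collapses concurrent requests for the same key into one origin fetch."""

    def __init__(self, fetch):
        self._fetch = fetch            # callable: key -> response body
        self._lock = threading.Lock()
        self._inflight = {}            # key -> Future shared by all waiters

    def get(self, key):
        with self._lock:
            future = self._inflight.get(key)
            is_leader = future is None
            if is_leader:
                future = Future()
                self._inflight[key] = future
        if is_leader:
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:
                future.set_exception(exc)   # waiters see the same failure
            finally:
                with self._lock:
                    self._inflight.pop(key, None)
        return future.result()


def demo(n_clients: int = 20) -> tuple:
    """Simulate a thundering herd; returns (origin fetches, responses served)."""
    origin_calls = []

    def slow_origin_fetch(key):
        origin_calls.append(key)
        time.sleep(0.3)                # simulated origin latency
        return b"segment-bytes"

    coalescer = Coalescer(slow_origin_fetch)
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(coalescer.get("/video/seg1.ts")))
        for _ in range(n_clients)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(origin_calls), len(results)


print(demo())  # expected: one origin fetch serving all 20 clients
```

A production edge would also bound how long waiters block and combine this with stale-while-revalidate so that most clients never wait at all.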

Diurnal Patterns

  • Peak hours: 7-11 PM local time (video streaming dominates)
  • Business hours: 9 AM - 5 PM (web/API traffic peaks)
  • Global smoothing: Follow-the-sun pattern means global peak is 2-3x global trough
  • Weekend vs weekday: Video traffic 20-40% higher on weekends

Capacity Planning for Peaks

  • Provisioning target: Handle 3-5x average traffic without degradation
  • Burst capacity: Additional 2-3x via traffic shedding and graceful degradation
  • Auto-scaling: Add capacity within minutes for sustained spikes
  • Pre-positioning: Warm caches before known events (scheduled launches, sports)

Latency Budgets

End-to-End Latency Breakdown (Cache Hit)

| Component | Budget | Notes |
|---|---|---|
| DNS resolution | 5-20ms | Anycast DNS, client-side caching |
| TCP + TLS handshake | 10-30ms | Lower with TLS 1.3 / QUIC 0-RTT resumption |
| Request to edge | 1-5ms | Last-mile network |
| Edge processing | 1-5ms | Cache lookup, header processing |
| Response transfer | 5-50ms | Depends on object size |
| Total (cache hit) | ~20-110ms | Sum of the rows above |

End-to-End Latency Breakdown (Cache Miss)

| Component | Budget | Notes |
|---|---|---|
| All of above | 20-50ms | Client to edge |
| Edge to origin | 50-200ms | Depends on origin location |
| Origin processing | 50-500ms | Application-dependent |
| Origin to edge | 50-200ms | Return path |
| Total (cache miss) | ~170-950ms | Sum of the rows above |
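
The two budgets combine into an expected latency once weighted by the hit ratio. The sketch below uses assumed representative values (30 ms for a hit, 400 ms for a miss); the function name is hypothetical.

```python
def expected_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Mean latency when a fraction hit_ratio of requests are cache hits."""
    if not 0 <= hit_ratio <= 1:
        raise ValueError("hit_ratio must be in [0, 1]")
    return hit_ratio * hit_ms + (1 - hit_ratio) * miss_ms


for hit in (0.90, 0.95, 0.99):
    print(f"{hit:.0%} hits -> {expected_latency_ms(hit, 30, 400):.1f} ms mean")
# 90% hits -> 67.0 ms mean
# 95% hits -> 48.5 ms mean
# 99% hits -> 33.7 ms mean
```

Because misses are an order of magnitude slower, a few points of hit ratio move the mean noticeably, and tail percentiles even more.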

SLA Targets

  • Availability: 99.99% (~52.6 minutes downtime/year)
  • P50 latency (hit): <30ms
  • P95 latency (hit): <75ms
  • P99 latency (hit): <150ms
  • Cache hit ratio: >90% for static content
  • Purge propagation: <5 seconds globally
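
The downtime allowance for an availability target follows directly from the fraction of the year the service may be unavailable. A short sketch, assuming a 365-day year:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    if not 0 <= availability <= 1:
        raise ValueError("availability must be in [0, 1]")
    return (1 - availability) * MINUTES_PER_YEAR


for target in (0.999, 0.9999, 0.99999):
    print(f"{target:.3%}: {allowed_downtime_minutes(target):.2f} min/year")
# 99.900%: 525.60 min/year
# 99.990%: 52.56 min/year
# 99.999%: 5.26 min/year
```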