Scale & Constraints

CDN Network - Scale and Constraints

Traffic Volume

Global Request Rate

Aggregate requests: 10-50 million requests/second globally (equivalent to roughly 26-130 trillion requests/month)
HTTP GET dominance: 95%+ of requests are GET; the remainder are HEAD, OPTIONS (CORS preflight), and POST (for edge functions)
Request size distribution: Average request header ~800 bytes; average response varies by content type
Connection reuse: 70-80% of requests served over persistent HTTP/2 or HTTP/3 connections
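
The per-second and per-month figures are related by a simple unit conversion. The sketch below assumes a 30-day month; the function name is illustrative only.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 s, assuming a 30-day month


def requests_per_month(requests_per_second: float) -> float:
    """Convert a sustained request rate into a monthly request count."""
    return requests_per_second * SECONDS_PER_MONTH


for rps in (10e6, 50e6):
    trillions = requests_per_month(rps) / 1e12
    print(f"{rps / 1e6:.0f}M req/s -> {trillions:.1f} trillion req/month")
```

At a sustained 10-50 million requests/second this works out to roughly 26-130 trillion requests per month.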

Bandwidth Per Edge Node

Tier 1 PoPs (major metros: NYC, London, Tokyo): 1-10 Tbps aggregate capacity
Tier 2 PoPs (secondary cities): 100 Gbps - 1 Tbps
Tier 3 PoPs (smaller markets, ISP-embedded): 10-100 Gbps
Per-server throughput: 40-100 Gbps per commodity server with DPDK/kernel bypass networking
Typical utilization target: 40-60% of peak capacity to absorb traffic spikes

Total Egress

Sustained global egress: 100-500 Tbps during peak hours
Monthly data transfer: 5-20 exabytes/month for a top-tier CDN
Egress cost structure: $0.01-0.08/GB depending on region (cheapest in NA/EU, most expensive in APAC/LATAM)
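
To relate a bandwidth figure in Tbps to a monthly transfer volume in exabytes, a rough conversion can help. This is a sketch assuming a 30-day month and decimal units (1 EB = 10^18 bytes); the helper names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def exabytes_per_month(avg_tbps: float, days: int = 30) -> float:
    """Monthly transfer volume (EB) for a given *average* rate in Tbps."""
    bits = avg_tbps * 1e12 * days * SECONDS_PER_DAY
    return bits / 8 / 1e18


def avg_tbps_for(exabytes: float, days: int = 30) -> float:
    """Average rate (Tbps) implied by a monthly transfer volume in EB."""
    bits = exabytes * 1e18 * 8
    return bits / (days * SECONDS_PER_DAY) / 1e12


print(f"{exabytes_per_month(100):.1f} EB/month at a constant 100 Tbps")
print(f"{avg_tbps_for(5):.1f}-{avg_tbps_for(20):.1f} Tbps average for 5-20 EB/month")
```

Under these assumptions, 5-20 EB/month implies an average rate of roughly 15-62 Tbps, well below the peak-hour egress figure, which is what a strongly diurnal traffic pattern would suggest.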

Storage Requirements

Cached Content Per PoP

PoP Tier	SSD Storage	HDD Storage	Total Capacity	Unique Objects
Tier 1	500 TB SSD	5 PB HDD	~5.5 PB	50-100 billion
Tier 2	100 TB SSD	1 PB HDD	~1.1 PB	10-20 billion
Tier 3	20 TB SSD	200 TB HDD	~220 TB	2-5 billion

Hot tier (SSD): Top 5-10% of content by access frequency; serves 80-90% of requests
Warm tier (HDD): Long-tail content; serves 10-20% of requests
Memory cache: 256-512 GB RAM per server for ultra-hot objects (top 0.1%)

Origin Storage

Total origin capacity: Varies by customer; CDN doesn't own origin content
Origin shield cache: 50-200 TB per shield region (aggregates misses before hitting origin)
Typical origin data set: 1 TB - 100 TB per large customer distribution

Metadata Storage

Cache index per server: 50-100 GB in RAM (URL hash → disk location mapping)
Per-object metadata: ~200 bytes (URL hash, content hash, TTL, headers, size, timestamps)
Global configuration database: 10-50 TB (routing rules, certificates, customer configs)
DNS zone data: 5-10 GB (millions of CNAME records, routing policies)
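
The index and per-object figures above bound how many objects a single server can track. A minimal sketch, assuming decimal gigabytes and a flat ~200 bytes per entry with no hash-table overhead (both simplifications):

```python
BYTES_PER_ENTRY = 200  # per-object metadata size quoted above


def objects_indexable(index_ram_gb: float, bytes_per_entry: int = BYTES_PER_ENTRY) -> int:
    """Number of cache entries that fit in an in-RAM index of the given size."""
    return int(index_ram_gb * 1e9 // bytes_per_entry)


def index_size_tb(object_count: float, bytes_per_entry: int = BYTES_PER_ENTRY) -> float:
    """Total index size (TB) needed to track a given number of objects."""
    return object_count * bytes_per_entry / 1e12


print(objects_indexable(50), objects_indexable(100))
print(index_size_tb(50e9))
```

With these inputs a 50-100 GB index covers about 250-500 million objects per server, and tracking 50 billion objects takes about 10 TB of index spread across a PoP's servers.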

Network Bandwidth

Inter-PoP Communication

Backbone links: 100 Gbps - 400 Gbps between major PoPs
Purpose: Cache fill from parent/peer PoPs, configuration propagation, health data
Traffic volume: 5-15% of total egress (most content served from local cache)
Latency budget: <50ms between regional PoPs; <150ms cross-continent

Origin-Pull Bandwidth

Cache miss rate: 5-15% of requests result in origin fetch
Origin pull bandwidth: 5-50 Tbps globally (aggregated across all origins)
Connection pooling: Maintain persistent connections to origins; 100-1000 connections per origin per PoP
Origin shield reduction: Reduces origin load by 60-80% by consolidating cache fills
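
The effect of an origin shield on pull bandwidth can be approximated as a two-stage filter: the edge miss rate first, then the shield's reduction. The sketch below assumes the request miss rate also applies to bytes, which is a simplification; the function name is hypothetical.

```python
def origin_pull_tbps(egress_tbps: float, miss_rate: float, shield_reduction: float = 0.0) -> float:
    """Estimate bandwidth reaching origins after edge caches and an optional shield.

    egress_tbps:      client-facing egress
    miss_rate:        fraction of traffic missing the edge cache (0..1)
    shield_reduction: fraction of edge misses absorbed by the shield (0..1)
    """
    edge_miss_tbps = egress_tbps * miss_rate
    return edge_miss_tbps * (1.0 - shield_reduction)


# 100 Tbps egress, 5% miss rate: without a shield vs. with an 80% reduction
print(origin_pull_tbps(100, 0.05))
print(origin_pull_tbps(100, 0.05, shield_reduction=0.80))
```

In this example the shield cuts origin pull from about 5 Tbps to about 1 Tbps.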

Client-Facing Bandwidth

Last-mile delivery: Dominated by video (70-80% of bytes), images (10-15%), other (5-10%)
Peak-to-trough ratio: 3-5x within a region (evening peak vs. overnight trough)
Per-user bandwidth: 5-50 Mbps for video streaming; 1-5 Mbps for web browsing
Protocol distribution: HTTP/2 (60%), HTTP/3 QUIC (25%), HTTP/1.1 (15%)

Geographic Distribution

Points of Presence (PoPs)

Total PoPs: 200-400 globally (for comparison, Cloudflare operates in 300+ cities, while Akamai's more distributed model spans 4,000+ edge locations)
Tier 1 PoPs: 20-30 (major internet exchange points)
Tier 2 PoPs: 50-100 (regional hubs)
Tier 3 PoPs: 100-300 (last-mile, ISP-embedded)

Regional Distribution

Region	PoPs	% of Traffic	Avg Latency to User
North America	60-80	30-35%	<10ms
Europe	50-70	25-30%	<15ms
Asia Pacific	40-60	20-25%	<20ms
Latin America	15-25	5-8%	<25ms
Middle East/Africa	10-20	3-5%	<30ms
Oceania	5-10	2-3%	<15ms

Edge Location Strategy

IXP co-location: Place servers at Internet Exchange Points for peering with multiple ISPs
ISP-embedded: Deploy cache servers inside ISP networks (like Netflix Open Connect)
Cloud region adjacency: PoPs near major cloud regions for fast origin pulls
Population density: Correlate PoP placement with internet user density

Content Types and Sizes

Static Assets

Content Type	Avg Size	% of Requests	% of Bandwidth	Cacheability
Images (JPEG/PNG/WebP)	50-500 KB	40-50%	15-20%	Highly cacheable
JavaScript bundles	100-500 KB	15-20%	5-10%	Cacheable with versioning
CSS files	20-100 KB	10-15%	2-5%	Cacheable with versioning
Fonts (WOFF2)	20-100 KB	5-8%	1-3%	Highly cacheable
HTML documents	10-100 KB	10-15%	2-5%	Short TTL or uncacheable

Video Streaming

Format	Segment Size	Bitrate Range	% of Bandwidth
HLS/DASH segments	2-10 MB	1-15 Mbps	60-70%
Manifests (.m3u8/.mpd)	1-50 KB	N/A	<1%
Thumbnails/previews	10-100 KB	N/A	2-3%

API Responses

JSON API responses: 1-50 KB average; cacheable for 1-60 seconds
GraphQL responses: 5-100 KB; often uncacheable due to personalization
Webhook/event data: Typically not cached; passed through edge

Cache Hit Ratios and Origin Load Impact

Target Hit Ratios by Content Type

Content Type	Target Hit Ratio	Origin Requests Saved
Static images	95-99%	95-99%
Video segments	85-95%	85-95%
CSS/JS bundles	90-98%	90-98%
HTML pages	60-80%	60-80%
API responses	40-70%	40-70%
Personalized content	0-20%	0-20%

Impact on Origin Load

Without CDN: Origin handles 50M req/s directly → requires massive origin infrastructure
With 90% hit ratio: Origin handles 5M req/s → 10x reduction in origin capacity needed
With 95% hit ratio: Origin handles 2.5M req/s → 20x reduction
Bandwidth savings: 90% hit ratio saves $10-50M/month in origin egress costs
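
The origin-load figures follow directly from the hit ratio: origin traffic is the miss fraction of total traffic, and the reduction factor is its reciprocal. A small sketch with illustrative function names:

```python
def origin_rps(total_rps: float, hit_ratio: float) -> float:
    """Requests per second that still reach the origin."""
    return total_rps * (1.0 - hit_ratio)


def reduction_factor(hit_ratio: float) -> float:
    """How many times smaller origin load is compared with no CDN."""
    return 1.0 / (1.0 - hit_ratio)


for hit in (0.90, 0.95, 0.99):
    print(f"hit={hit:.0%}: origin {origin_rps(50e6, hit) / 1e6:.1f}M req/s, "
          f"{reduction_factor(hit):.0f}x reduction")
```

Because the reduction factor is 1 / (1 - hit ratio), each additional point of hit ratio matters more as the ratio approaches 100%: moving from 90% to 95% halves origin load.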

Factors Affecting Hit Ratio

Content diversity: More unique URLs → lower hit ratio per object
TTL configuration: Longer TTLs → higher hit ratio but staleness risk
Query string handling: Ignoring irrelevant query params improves hit ratio
Geographic spread: Popular content in one region may be cold in another
Cache capacity: Larger caches hold more long-tail content
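
Query string handling is often implemented as cache-key normalization: drop parameters that do not affect the response and sort the rest so equivalent URLs map to one cache entry. The sketch below uses Python's standard `urllib.parse`; the `IGNORED_PARAMS` set is a hypothetical example, and a real deployment would configure it per customer.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters assumed not to change the response body.
IGNORED_PARAMS = frozenset({"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"})


def cache_key(url: str) -> str:
    """Return a normalized URL suitable for use as a cache key."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    )
    # Lowercase the host, keep the path as-is, and drop the fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))


a = cache_key("https://Example.com/img/a.jpg?w=200&utm_source=mail")
b = cache_key("https://example.com/img/a.jpg?utm_source=ad&w=200")
print(a == b, a)
```

Both URLs above collapse to the same key, so the second request is a hit instead of a separate cache fill.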

Cost Estimation

Bandwidth Costs (Monthly)

Component	Cost Range	Notes
Transit bandwidth	$0.50-2.00/Mbps/month	Varies by region and commitment
Peering (settlement-free)	$0 (capex only)	Requires traffic ratio balance
ISP-embedded delivery	$0.005-0.01/GB	Revenue share model
Cross-connect fees	$500-2000/port/month	Per IXP connection
Total bandwidth	$15-30M/month	For a top-10 CDN

Storage Costs (Monthly)

Component	Cost	Notes
SSD (hot cache)	$0.10-0.20/GB/month	NVMe drives, 3-year amortization
HDD (warm cache)	$0.02-0.05/GB/month	High-capacity drives
RAM (memory cache)	$5-8/GB/month	Server memory
Total storage	$5-10M/month	Across all PoPs

Compute at Edge (Monthly)

Component	Cost	Notes
Server hardware (amortized)	$10-15M	3-year refresh cycle
Power and cooling	$5-8M	10-20 kW per rack
Rack space / colocation	$3-5M	Per-rack fees at IXPs
Network equipment	$2-4M	Routers, switches, load balancers
Total compute	$20-32M/month

Total Cost of Ownership

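Monthly operational cost: $50-80M for a global CDN (the bandwidth, storage, and compute subtotals above sum to $40-72M; the remainder is not itemized here)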
Cost per GB delivered: $0.005-0.02 (internal cost)
Customer pricing: $0.02-0.15/GB (margin depends on volume and region)
Cost per million requests: $0.50-2.00
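
As a sanity check, the unit costs can be derived from the monthly totals. This sketch pairs the cost range above with the transfer volume (5-20 EB/month) and request rate (10-50M req/s) quoted earlier; it assumes decimal units, a 30-day month, and that all cost is attributed to a single metric, which overstates each one.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600
GB_PER_EB = 1e9


def cost_per_gb(monthly_cost_usd: float, monthly_exabytes: float) -> float:
    """Internal cost per GB delivered."""
    return monthly_cost_usd / (monthly_exabytes * GB_PER_EB)


def cost_per_million_requests(monthly_cost_usd: float, requests_per_second: float) -> float:
    """Internal cost per million requests served."""
    millions_of_requests = requests_per_second * SECONDS_PER_MONTH / 1e6
    return monthly_cost_usd / millions_of_requests


print(cost_per_gb(50e6, 20), cost_per_gb(80e6, 5))
print(cost_per_million_requests(50e6, 50e6), cost_per_million_requests(80e6, 10e6))
```

Pairing the extremes gives roughly $0.0025-0.016 per GB and $0.39-3.09 per million requests, the same order of magnitude as the ranges listed.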

Peak Traffic Patterns

Live Events

Major sporting events (Super Bowl, World Cup): 10-50x normal traffic spike
Product launches (Apple keynote, game releases): 5-20x spike in specific regions
Breaking news: 3-10x spike, geographically concentrated initially then spreading
Ramp-up time: Seconds to minutes; must absorb without pre-warming

Flash Crowds

Viral content: Single object goes from 0 to millions of requests/second
Social media amplification: Tweet/post drives traffic to single URL
Thundering herd on cache expiry: Many simultaneous requests for expired popular object
Mitigation: Request coalescing, stale-while-revalidate, adaptive TTLs
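
Request coalescing can be sketched as a "single-flight" wrapper: the first request for a key performs the fetch, and concurrent requests for the same key wait for that result instead of going to the origin themselves. The `Coalescer` class and the `fetch` callable are hypothetical names for illustration. This version is thread-based and in-process, whereas a production edge server would more likely do this inside an event loop.

```python
import threading
from concurrent.futures import Future
from typing import Callable, Dict


class Coalescer:
    """Collapse concurrent fetches for the same key into one upstream call."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._inflight: Dict[str, Future] = {}

    def get(self, key: str) -> bytes:
        with self._lock:
            future = self._inflight.get(key)
            is_leader = future is None
            if is_leader:
                future = Future()
                self._inflight[key] = future

        if is_leader:
            try:
                future.set_result(self._fetch(key))  # single upstream fetch
            except Exception as exc:
                future.set_exception(exc)  # followers see the same error
            finally:
                with self._lock:
                    del self._inflight[key]

        # Leader returns immediately; followers block until the leader finishes.
        return future.result()
```

A caller would wrap its origin-fetch function once, for example `coalescer = Coalescer(fetch_from_origin)`, and route every cache miss through `coalescer.get(url)`. Stale-while-revalidate complements this by serving the expired copy to followers instead of making them wait.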

Diurnal Patterns

Peak hours: 7-11 PM local time (video streaming dominates)
Business hours: 9 AM - 5 PM (web/API traffic peaks)
Global smoothing: Follow-the-sun pattern means global peak is 2-3x global trough
Weekend vs weekday: Video traffic 20-40% higher on weekends

Capacity Planning for Peaks

Provisioning target: Handle 3-5x average traffic without degradation
Burst capacity: Additional 2-3x via traffic shedding and graceful degradation
Auto-scaling: Add capacity within minutes for sustained spikes
Pre-positioning: Warm caches before known events (scheduled launches, sports)

Latency Budgets

End-to-End Latency Breakdown (Cache Hit)

Component	Budget	Notes
DNS resolution	5-20ms	Anycast DNS, client-side caching
TCP + TLS handshake	10-30ms	Lower on resumed connections (0-RTT with TLS 1.3 / QUIC)
Request to edge	1-5ms	Last-mile network
Edge processing	1-5ms	Cache lookup, header processing
Response transfer	5-50ms	Depends on object size
Total (cache hit)	22-110ms

End-to-End Latency Breakdown (Cache Miss)

Component	Budget	Notes
All of above	20-50ms	Client to edge
Edge to origin	50-200ms	Depends on origin location
Origin processing	50-500ms	Application-dependent
Origin to edge	50-200ms	Return path
Total (cache miss)	170-950ms
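
Summing the per-component rows gives the overall bounds for each path, and weighting hit and miss latency by the hit ratio gives an expected latency. The sketch below copies the component budgets from the two tables; the 30 ms / 300 ms inputs in the last line are assumed example values, not figures from the tables.

```python
from typing import Dict, Tuple

Budget = Dict[str, Tuple[int, int]]  # component -> (min_ms, max_ms)

HIT_BUDGET_MS: Budget = {
    "dns_resolution": (5, 20),
    "tcp_tls_handshake": (10, 30),
    "request_to_edge": (1, 5),
    "edge_processing": (1, 5),
    "response_transfer": (5, 50),
}

MISS_BUDGET_MS: Budget = {
    "client_to_edge": (20, 50),
    "edge_to_origin": (50, 200),
    "origin_processing": (50, 500),
    "origin_to_edge": (50, 200),
}


def total_budget(rows: Budget) -> Tuple[int, int]:
    """Sum the lower and upper bounds of every component."""
    return sum(lo for lo, _ in rows.values()), sum(hi for _, hi in rows.values())


def expected_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Average latency across hits and misses for a given hit ratio."""
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


print(total_budget(HIT_BUDGET_MS), total_budget(MISS_BUDGET_MS))
print(expected_latency_ms(0.90, hit_ms=30, miss_ms=300))
```

Even at a 90% hit ratio, the misses contribute about half of the average latency in this example, which is why tail latency tracks the miss path.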

SLA Targets

Availability: 99.99% (about 52.6 minutes of downtime per year)
P50 latency (hit): <30ms
P95 latency (hit): <75ms
P99 latency (hit): <150ms
Cache hit ratio: >90% for static content
Purge propagation: <5 seconds globally
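
An availability target translates into a downtime budget by multiplying the unavailable fraction by the period length. A minimal sketch, assuming a 365-day year and a 30-day month:

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600
MINUTES_PER_MONTH = 30 * 24 * 60   # 43,200


def downtime_budget_minutes(availability: float, period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Allowed downtime in minutes for an availability fraction such as 0.9999."""
    return (1.0 - availability) * period_minutes


print(f"{downtime_budget_minutes(0.9999):.2f} min/year")
print(f"{downtime_budget_minutes(0.9999, MINUTES_PER_MONTH):.2f} min/month")
```

For 99.99% this comes to about 52.6 minutes per year, or a little over 4 minutes in a 30-day month.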