1. Network connectivity:
- Proximity to Internet Exchange Points (IXPs)
- Available peering partners (Tier 1 ISPs, eyeball networks)
- Backbone connectivity to existing PoPs
- Latency to target user population (<10ms goal)
2. User demand analysis:
- DNS query patterns showing underserved regions
- Latency measurements from RUM (Real User Monitoring)
- Traffic volume from region exceeding threshold (>5 Gbps sustained)
- Customer requests for coverage in specific markets
3. Infrastructure availability:
- Data center space and power (minimum 2-5 MW)
- Redundant power feeds and cooling
- Physical security and compliance certifications
- Expansion capacity for 3-5 year growth
4. Cost considerations:
- Colocation costs per kW/rack
- Local transit pricing (varies 10x between regions)
- Peering availability (settlement-free vs paid)
- Labor costs for on-site maintenance
Peering Strategy
Peering types at each PoP:
- Public peering: Connect at IXP route servers (low cost, many peers)
- Private peering: Direct cross-connects to major ISPs (better performance)
- Paid transit: Backup connectivity via Tier 1 providers (Cogent, Lumen, NTT)
- ISP embedding: Place servers inside ISP networks (best latency)
Peering economics:
- Target: 80%+ traffic via settlement-free peering
- Remaining: 15% paid transit, 5% backbone to other PoPs
- Break-even: PoP becomes profitable at ~20 Gbps sustained traffic
- ROI timeline: 12-18 months for Tier 2 PoPs, 6-12 months for Tier 1
Capacity Planning Per PoP
Initial deployment (Tier 2 PoP):
- 4-8 racks of servers (48-96 servers)
- 100 Gbps aggregate network capacity
- 200 TB SSD + 1 PB HDD storage
- 2x 100GE uplinks (redundant)
Growth triggers for expansion:
- CPU utilization > 60% sustained
- Network utilization > 50% of capacity
- Storage utilization > 75%
- Cache eviction rate increasing (hit ratio dropping)
- P95 latency exceeding SLA
Scaling unit:
- Add capacity in "pods" of 2 racks (24 servers)
- Each pod adds: ~25 Gbps throughput, 50 TB SSD, 250 TB HDD
- Deployment time: 2-4 weeks (hardware procurement to production)
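The utilization triggers and the pod-sized scaling unit can be expressed as a small check. This is a minimal sketch: the thresholds and the ~25 Gbps per pod come from the lists above, while `PopMetrics`, `expansion_triggers`, and `pods_needed` are illustrative names, and the eviction-rate and SLA triggers are left out because they need trend data.

```python
import math
from dataclasses import dataclass

@dataclass
class PopMetrics:
    cpu_util: float       # sustained CPU utilization, 0.0-1.0
    network_util: float   # fraction of network capacity in use
    storage_util: float   # fraction of storage in use

def expansion_triggers(m: PopMetrics) -> list:
    """Return the names of the utilization triggers that have fired."""
    fired = []
    if m.cpu_util > 0.60:
        fired.append("cpu")
    if m.network_util > 0.50:
        fired.append("network")
    if m.storage_util > 0.75:
        fired.append("storage")
    return fired

POD_GBPS = 25  # approximate throughput added by one 2-rack pod

def pods_needed(extra_gbps: float) -> int:
    """Number of pods required to add at least extra_gbps of throughput."""
    return math.ceil(extra_gbps / POD_GBPS)
```

For example, a PoP that needs another 60 Gbps would order three pods under these assumptions.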
Cache Warming Strategies for New Edges
Proactive Warming
Strategy 1: Top-N content pre-population
- Identify top 10,000 objects by request count from nearest existing PoP
- Pre-fetch from origin or peer PoP before announcing the new edge
- Covers 60-80% of expected requests on day one
- Execution: Background job pulling content at 10 Gbps for 2-4 hours
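Selecting the warm set is essentially a frequency count over the neighbouring PoP's request log. The sketch below assumes the log is available as a list of URLs; `top_n_objects` and `coverage` are hypothetical helper names, and `coverage` only estimates the day-one share by replaying the same log.

```python
from collections import Counter

def top_n_objects(request_log, n=10_000):
    """Return the n most requested URLs, most popular first."""
    counts = Counter(request_log)
    return [url for url, _ in counts.most_common(n)]

def coverage(request_log, warmed):
    """Fraction of logged requests that the warmed set would have served."""
    warmed = set(warmed)
    if not request_log:
        return 0.0
    hits = sum(1 for url in request_log if url in warmed)
    return hits / len(request_log)
```

Replaying the log through `coverage` is one way to check whether a chosen N lands in the 60-80% range before the pre-fetch job runs.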
Strategy 2: Peer-assisted warming
- New PoP initially routes cache misses to nearest peer PoP (not origin)
- Peer PoP acts as temporary origin shield
- Gradually shift to direct origin pulls as local cache fills
- Timeline: 24-72 hours to reach steady-state hit ratio
Strategy 3: Traffic shadowing
- Before going live, mirror a percentage of traffic from nearby PoP
- Process requests but discard responses (just populate cache)
- Validate cache correctness before serving real users
- Duration: 4-12 hours of shadowing
Gradual Traffic Migration
Phase 1 (Day 0): Shadow traffic only, no real serving
Phase 2 (Day 1): 5% of regional traffic via weighted DNS
Phase 3 (Day 2-3): 25% of traffic, monitor hit ratio and latency
Phase 4 (Day 4-7): 50% of traffic, validate at scale
Phase 5 (Day 7+): 100% of traffic, full production
Rollback criteria:
- Cache hit ratio < 70% (expected >85% by Phase 3)
- P95 latency > 2x existing PoPs in region
- Error rate > 0.1%
- Any hardware failures during ramp
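The ramp schedule and rollback criteria can be encoded as data plus one check. This is a sketch under assumptions: `RampMetrics`, `traffic_fraction`, and `rollback_reasons` are illustrative names, and day 7 is treated as the start of Phase 5 because the source lists it in both Phase 4 and Phase 5.

```python
from dataclasses import dataclass

# (first day of phase, fraction of regional traffic served)
RAMP = [(0, 0.00), (1, 0.05), (2, 0.25), (4, 0.50), (7, 1.00)]

@dataclass
class RampMetrics:
    hit_ratio: float         # cache hit ratio, 0.0-1.0
    p95_latency_ms: float    # P95 latency at the new PoP
    regional_p95_ms: float   # P95 latency of existing PoPs in the region
    error_rate: float        # 0.0-1.0
    hardware_failures: int   # failures observed during the ramp

def traffic_fraction(day: int) -> float:
    """Traffic share the new PoP should receive on a given day."""
    share = 0.0
    for start_day, fraction in RAMP:
        if day >= start_day:
            share = fraction
    return share

def rollback_reasons(m: RampMetrics) -> list:
    """Return every rollback criterion that the metrics violate."""
    reasons = []
    if m.hit_ratio < 0.70:
        reasons.append("hit_ratio")
    if m.p95_latency_ms > 2 * m.regional_p95_ms:
        reasons.append("latency")
    if m.error_rate > 0.001:
        reasons.append("error_rate")
    if m.hardware_failures > 0:
        reasons.append("hardware")
    return reasons
```

An empty list from `rollback_reasons` means the ramp can proceed to the next phase.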
Origin Shield / Mid-Tier Caching
Architecture
Without origin shield:
Client → Edge PoP → Origin
Problem: 200+ PoPs each independently fetching same content from origin
Origin load: cache_miss_rate * total_requests * num_pops
With origin shield:
Client → Edge PoP → Shield PoP → Origin
Benefit: Shield consolidates misses from all edge PoPs in a region
Origin load: cache_miss_rate * total_requests (not multiplied by num_pops)
Typical reduction: 60-80% fewer origin requests
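A toy simulation makes the consolidation effect concrete. It is a deliberately simplified sketch: caches are modelled as unbounded sets with no expiry, so only cold misses are counted, and `origin_fetches` is a hypothetical name. Real reductions depend on TTLs, eviction, and traffic mix.

```python
def origin_fetches(requests, use_shield):
    """Count origin fetches for a stream of (pop, url) requests."""
    edge_caches = {}       # pop -> set of URLs cached at that edge
    shield_cache = set()   # URLs cached at the single shield
    fetches = 0
    for pop, url in requests:
        cache = edge_caches.setdefault(pop, set())
        if url in cache:
            continue                 # edge hit
        cache.add(url)               # edge miss: fill the edge cache
        if use_shield:
            if url in shield_cache:
                continue             # shield hit, origin is not contacted
            shield_cache.add(url)
        fetches += 1                 # the miss reaches origin
    return fetches
```

With 5 PoPs each requesting the same 2 objects, this model sends 10 fetches to origin without a shield and 2 with one.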
Shield Placement
Shield regions (typically 3-8 globally):
- US East (Virginia) - covers NA East + EU overflow
- US West (Oregon) - covers NA West + APAC overflow
- EU West (Frankfurt) - covers EU + Middle East
- AP Northeast (Tokyo) - covers APAC North
- AP Southeast (Singapore) - covers APAC South
- SA East (Sao Paulo) - covers Latin America
Selection criteria:
- Low latency to origin servers (most origins in cloud regions)
- High bandwidth connectivity to edge PoPs
- Sufficient storage for full content catalog
- Redundancy: each shield has a failover shield
Shield Cache Behavior
Shield-specific optimizations:
- Larger cache capacity (10-50x edge PoP)
- Longer TTLs (can serve stale-while-revalidate to edges)
- Request coalescing: collapse concurrent misses for same object
- Negative caching: cache 404s to prevent origin hammering
- Connection pooling: maintain persistent connections to origin
Request coalescing detail:
- First request for uncached object: fetch from origin
- Concurrent requests for same object: queue and wait
- When origin responds: serve all queued requests from single fetch
- Prevents thundering herd on popular content expiry
- Implementation: per-URL mutex with timeout (5-10 seconds)
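The coalescing steps above can be sketched with `asyncio`. Note the substitution: instead of a literal per-URL mutex, this version shares one in-flight task per key, which gives the same "one fetch, many waiters" behaviour. `CoalescingFetcher` and `fetch_origin` are hypothetical names, and the 5-second default is taken from the timeout range above.

```python
import asyncio

class CoalescingFetcher:
    """Collapse concurrent misses for the same key into one origin fetch."""

    def __init__(self, fetch_origin, timeout=5.0):
        self._fetch_origin = fetch_origin   # async callable: key -> body
        self._timeout = timeout             # seconds a waiter will queue
        self._inflight = {}                 # key -> asyncio.Task

    async def get(self, key):
        task = self._inflight.get(key)
        if task is None:
            # First request for this key: start the single origin fetch.
            task = asyncio.create_task(self._fetch_origin(key))
            self._inflight[key] = task
            # Forget the task once it finishes so later misses refetch.
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        # shield() keeps the shared fetch running if one waiter times out.
        return await asyncio.wait_for(asyncio.shield(task), self._timeout)
```

A waiter that exceeds the timeout gets a timeout error while the shared fetch continues for the remaining waiters. Caching the response after the fetch completes is outside this sketch.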
Consistent Hashing for Cache Distribution Within a PoP
Problem Statement
Within a PoP with 100+ servers:
- Each server has limited storage (50-500 TB)
- Total unique content >> single server capacity
- Need to route requests to the server most likely to have the content cached
- Must handle server additions/removals gracefully
Implementation
Hash ring configuration:
- Hash function: xxHash64 (fast, good distribution)
- Virtual nodes per server: 150-200 (ensures even distribution)
- Key: xxHash64(url + vary_key), used directly as the ring position (a 64-bit hash already spans the ring, so no modulo is needed)
- Ring size: 2^64 (full 64-bit space)
Request routing within PoP:
1. Load balancer receives request
2. Compute hash of cache key (URL + relevant headers)
3. Find responsible server on hash ring (binary search)
4. Route request to that server
5. If server is down: route to next server on ring (replication)
Rebalancing on server add/remove:
- Adding 1 server to a 100-server PoP: only ~1% of objects need to move
- Removing 1 server: its objects redistribute to adjacent ring nodes
- No full re-hash required (unlike modulo-based sharding)
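The ring, virtual nodes, binary-search lookup, and next-server fallback can be sketched as follows. Two assumptions are worth flagging: BLAKE2b from the standard library stands in for xxHash64 (which needs a third-party package), and `HashRing`, `h64`, and the `is_up` callback are illustrative names rather than any particular product's API.

```python
import bisect
import hashlib

def h64(data: str) -> int:
    """64-bit hash of a string (BLAKE2b stand-in for xxHash64)."""
    digest = hashlib.blake2b(data.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

class HashRing:
    def __init__(self, servers=(), vnodes=150):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        """Place vnodes virtual nodes for the server on the ring."""
        for i in range(self.vnodes):
            point = h64(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        """Drop all of the server's virtual nodes from the ring."""
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, cache_key, is_up=lambda server: True):
        """First healthy server clockwise from the key's ring position."""
        start = bisect.bisect_right(self._points, h64(cache_key))
        for step in range(len(self._points)):
            point = self._points[(start + step) % len(self._points)]
            if is_up(self._owner[point]):
                return self._owner[point]
        raise LookupError("no healthy server on ring")
```

Because only the new server's virtual nodes are inserted, every key that changes owner after `add` moves to the new server, and all other assignments stay put.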
Two-Tier Routing
Tier 1: Consistent hash determines "primary" server for an object
Tier 2: Hot objects replicated to multiple servers
Detection of hot objects:
- Request rate > 10,000 req/s for a single object
- Single server CPU > 80% due to one object
- Automatic promotion: replicate to 3-5 servers
- Load balancer distributes hot object requests across replicas
Implementation:
- Maintain "hot object list" updated every 5 seconds
- Hot objects bypass consistent hash → round-robin across replicas
- Cool-down: remove from hot list after 60 seconds below threshold
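A minimal sketch of the two-tier decision: hot keys go round-robin over their replicas, everything else goes to the consistent-hash primary. `HotObjectRouter`, `primary_for`, and `replicas_for` are hypothetical names; the caller is assumed to feed one rate sample per key on each 5-second update, and only the request-rate signal is modelled.

```python
import itertools

class HotObjectRouter:
    def __init__(self, primary_for, replicas_for,
                 threshold_rps=10_000, cooldown_s=60.0):
        self._primary_for = primary_for     # key -> server (consistent hash)
        self._replicas_for = replicas_for   # key -> list of replica servers
        self._threshold = threshold_rps
        self._cooldown = cooldown_s
        self._hot = {}   # key -> [round-robin iterator, last time above threshold]

    def observe(self, key, rps, now):
        """Feed one request-rate sample for a key at time now (seconds)."""
        if rps > self._threshold:
            if key in self._hot:
                self._hot[key][1] = now
            else:
                replicas = itertools.cycle(self._replicas_for(key))
                self._hot[key] = [replicas, now]
        elif key in self._hot and now - self._hot[key][1] >= self._cooldown:
            del self._hot[key]   # cooled down: back to the hash ring

    def route(self, key):
        """Pick the server for a request."""
        entry = self._hot.get(key)
        return next(entry[0]) if entry else self._primary_for(key)
```

Passing `now` explicitly keeps the cool-down logic testable without a real clock.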
Hot Content Handling
Viral Content Detection
Signals for hot content:
- Request rate acceleration: >100% increase in 60 seconds
- Absolute threshold: >50,000 req/s for single URL
- Geographic spread: requests from >10 PoPs simultaneously
- Referrer analysis: traffic from social media platforms
Detection latency: <10 seconds from onset to detection
Mitigation Strategies
Strategy 1: Request coalescing (collapse)
- Multiple concurrent requests for same uncached object
- Only one request goes to origin
- All others wait and receive the same response
- Effective for: cache miss thundering herd
Strategy 2: Micro-caching
- Cache even "uncacheable" responses for 1-5 seconds
- Reduces origin load by 100-1000x during spikes
- Trade-off: slight staleness for massive load reduction
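Micro-caching is a very short TTL applied regardless of cacheability. The sketch below assumes a synchronous `fetch` callable and an injectable clock so the behaviour can be exercised deterministically; `MicroCache` is an illustrative name. The arithmetic behind the load reduction is simple: at 1,000 req/s with a 1-second TTL, origin sees roughly one request per second for that key.

```python
import time

class MicroCache:
    """Cache any response for a very short, fixed TTL."""

    def __init__(self, ttl_s=1.0, clock=time.monotonic):
        self._ttl = ttl_s
        self._clock = clock
        self._store = {}   # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]               # still fresh: origin is not contacted
        value = fetch(key)                # expired or missing: one origin call
        self._store[key] = (now + self._ttl, value)
        return value
```

This version does not evict old keys or coalesce concurrent misses; both would be needed in front of a real origin.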
Strategy 3: Edge replication
- Replicate hot objects to all servers in PoP (not just hash-assigned)
- Spread load across entire PoP capacity
- Automatic: triggered when single-server load exceeds threshold
Strategy 4: Stale-while-revalidate
- Serve expired content while fetching fresh copy in background
- Users get instant response (stale by seconds/minutes)
- Origin gets single revalidation request instead of thundering herd
Strategy 5: Pre-positioning for known events
- Sports events, product launches: pre-warm all PoPs
- Push content to edges before event starts
- Coordinate with content providers for early access to assets
Video Streaming Optimization
Adaptive Bitrate (ABR) Delivery
HLS/DASH segment caching:
- Manifest files (.m3u8, .mpd): cache 1-5 seconds (live) or 1 hour (VOD)
- Video segments (.ts, .m4s): cache for hours/days (immutable by URL)
- Segment sizes: 2-10 seconds of video, 2-10 MB per segment
- Bitrate ladder: 6-8 renditions (360p to 4K)
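The manifest-versus-segment TTL split can be written as a lookup on the file extension. This is a sketch with assumed values: 2 seconds for live manifests (from the 1-5 second range) and one day for segments (from "hours/days"); `cache_ttl_seconds` is a hypothetical helper name.

```python
import posixpath

MANIFEST_EXT = {".m3u8", ".mpd"}
SEGMENT_EXT = {".ts", ".m4s"}

def cache_ttl_seconds(path, live):
    """Return a cache TTL in seconds, or None for non-video paths."""
    ext = posixpath.splitext(path)[1].lower()
    if ext in MANIFEST_EXT:
        # Live manifests change every segment; VOD manifests are stable.
        return 2 if live else 3600
    if ext in SEGMENT_EXT:
        # Segments are immutable by URL, so a long TTL is safe.
        return 86400
    return None
```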
Optimization techniques:
- Predictive prefetch: pre-cache next 2-3 segments based on playback position
- Bitrate-aware caching: prioritize popular bitrates (720p, 1080p)
- Manifest manipulation: inject CDN-specific segment URLs at edge
- CMAF: Common Media Application Format for unified HLS+DASH
Chunked Delivery and Pre-positioning
Live streaming optimization:
- Segments available at edge within 1-2 seconds of encoding
- Push-based distribution for live content (don't wait for pull)
- Regional fanout: origin → shield → edges (tree distribution)
- Latency target: <5 seconds glass-to-glass for live
VOD optimization:
- Pre-position popular titles to all PoPs during off-peak hours
- Tiered storage: first 10 minutes on SSD, rest on HDD
- Range request optimization: serve partial segments efficiently
- Byte-range coalescing: combine small range requests into larger reads
Video-Specific Caching Strategies
Cache key design for video:
- Include: URL path, segment number
- Exclude: session tokens, tracking params
- Normalize: remove cache-busting params that don't affect content
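A cache key built along these lines might look like the sketch below. The parameter names in `IGNORED_PARAMS` are hypothetical examples; a real deployment would configure its own list. The host is dropped so that the key depends only on the path (which carries the segment number) and the parameters that affect content.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical session, tracking, and cache-busting parameter names.
IGNORED_PARAMS = {"token", "session", "utm_source", "utm_medium",
                  "utm_campaign", "_"}

def video_cache_key(url):
    """Normalize a segment URL into a cache key."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query)
                  if k not in IGNORED_PARAMS)
    query = urlencode(kept)   # sorted so parameter order does not split the cache
    return parts.path + ("?" + query if query else "")
```

Two viewers with different session tokens then map to the same cached segment.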
Storage optimization:
- Deduplication: same content at different bitrates shares base layer
- Compression: video segments already compressed, skip re-compression
- Tiered eviction: evict low-popularity bitrates first, keep 720p/1080p
- Storage allocation: 60-70% of edge storage dedicated to video
Challenges:
- Millions of functions deployed across hundreds of PoPs
- Cold start latency must be minimal (<5ms for V8 isolates)
- Memory pressure from many concurrent isolates
- CPU contention between compute and cache serving
Solutions:
- Isolate pooling: pre-warm isolates for popular functions
- Tiered execution: simple functions at all edges, complex at regional
- Resource quotas: per-customer CPU/memory limits
- Overflow routing: redirect compute-heavy requests to compute PoPs
- Auto-scaling: spin up additional isolate capacity based on demand
Deployment model:
- Deploy to all 200+ PoPs within 30 seconds
- Canary deployment: 1% of traffic → 10% → 100%
- Instant rollback: revert to previous version in <5 seconds
- A/B testing: route percentage of traffic to different versions
Multi-CDN Strategies and Failover
Multi-CDN Architecture
Why multi-CDN:
- Redundancy: no single CDN is a SPOF
- Performance: different CDNs perform better in different regions
- Cost optimization: leverage competitive pricing
- Capacity: aggregate bandwidth across providers for mega-events
Implementation approaches:
1. DNS-based switching:
- GeoDNS routes to best CDN per region
- Health checks detect CDN outages
- Failover time: 30-60 seconds (DNS TTL)
2. Client-side switching:
- JavaScript/player logic detects failures
- Automatic retry with alternate CDN URL
- Failover time: <5 seconds
- Best for video players
3. Origin-side routing:
- Origin decides which CDN to use per request
- Based on real-time performance data
- Most control but adds origin complexity
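Client-side switching boils down to "try the next CDN when the current one fails". The sketch below shows that retry loop in Python for consistency with the other examples, although real players would implement it in JavaScript or native code. `fetch_with_failover` is a hypothetical name, and `fetch` is assumed to raise `OSError` (or a subclass such as `ConnectionError`) on network failure.

```python
def fetch_with_failover(path, cdn_hosts, fetch):
    """Try each CDN host in order; return (host, body) from the first success."""
    errors = []
    for host in cdn_hosts:
        try:
            return host, fetch(f"https://{host}{path}")
        except OSError as exc:
            # Network-level failure: remember it and try the next CDN.
            errors.append((host, exc))
    raise ConnectionError(f"all CDNs failed for {path}: {errors}")
```

Ordering `cdn_hosts` by recent measured performance would combine this with the performance-based routing described above.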
Traffic Distribution Strategies
Strategy 1: Active-Active (performance-based)
- Continuously measure latency/throughput per CDN per region
- Route traffic to best-performing CDN for each user
- Rebalance every 5-15 minutes based on measurements
- Typical split: 60/40 or 70/30 between primary/secondary
Strategy 2: Active-Passive (failover)
- Primary CDN handles 100% of traffic
- Secondary CDN on standby (warm cache via prefetch)
- Automatic failover on primary degradation
- Failback after primary recovers (with validation)
Strategy 3: Content-based splitting
- Static assets → CDN A (best price for bandwidth)
- Video streaming → CDN B (best video optimization)
- API/dynamic → CDN C (best edge compute)
- Each CDN optimized for its content type
Failover Detection and Response
Health monitoring:
- Synthetic probes from 50+ global locations every 30 seconds
- RUM (Real User Monitoring) data from actual users
- Origin-side monitoring of CDN pull patterns
- Third-party monitoring (Catchpoint, ThousandEyes)
Failover triggers:
- Availability < 99.5% over 5-minute window
- P95 latency > 2x baseline for region
- Error rate > 1% for 3+ consecutive minutes
- Complete unreachability from 3+ probe locations
Failover execution:
- Update DNS records (TTL: 30-60 seconds)
- Notify operations team
- Begin cache warming on failover CDN
- Monitor failover CDN performance
- Document incident for post-mortem
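The failover triggers translate directly into a predicate over aggregated health data. In this sketch the `CdnHealth` fields are assumed to be pre-computed by the monitoring pipeline, and the names are illustrative. It treats any single trigger as sufficient, which is an assumption the source does not state.

```python
from dataclasses import dataclass

@dataclass
class CdnHealth:
    availability: float       # over the last 5 minutes, 0.0-1.0
    p95_ms: float             # current regional P95 latency
    baseline_p95_ms: float    # normal P95 latency for the same region
    high_error_minutes: int   # consecutive minutes with error rate > 1%
    unreachable_probes: int   # probe locations that cannot reach the CDN

def failover_triggers(h: CdnHealth) -> list:
    """Return the names of all failover triggers that have fired."""
    fired = []
    if h.availability < 0.995:
        fired.append("availability")
    if h.p95_ms > 2 * h.baseline_p95_ms:
        fired.append("latency")
    if h.high_error_minutes >= 3:
        fired.append("error_rate")
    if h.unreachable_probes >= 3:
        fired.append("unreachable")
    return fired
```

Returning the fired trigger names rather than a bare boolean gives the incident record a starting point for the post-mortem.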
Capacity Planning and Growth
Forecasting Model
Inputs:
- Historical traffic growth (typically 20-40% YoY for internet traffic)
- Customer pipeline (new large customers onboarding)
- Seasonal patterns (holiday shopping, summer streaming)
- One-time events (Olympics, elections, product launches)
Planning horizons:
- Short-term (0-3 months): handle with existing capacity + burst
- Medium-term (3-12 months): hardware procurement and deployment
- Long-term (1-3 years): new PoP construction, technology refresh
Capacity buffer:
- Maintain 40-50% headroom above average utilization
- Burst capacity: handle 3-5x average for 1-hour periods
- Emergency capacity: graceful degradation plan for >5x spikes
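The growth and headroom figures combine into a simple compound-growth estimate. This sketch reads "headroom" as the fraction of installed capacity left free at average load, which is one plausible interpretation and not something the source specifies. `required_capacity_gbps` is a hypothetical name, and the pipeline, seasonal, and one-time-event inputs are not modelled.

```python
def required_capacity_gbps(avg_gbps, yoy_growth, years, headroom=0.45):
    """Capacity needed so projected average traffic leaves headroom free.

    avg_gbps:   current average traffic
    yoy_growth: year-over-year growth as a fraction (0.30 means 30%)
    years:      planning horizon in years
    headroom:   fraction of capacity to keep free at average load
    """
    projected = avg_gbps * (1 + yoy_growth) ** years
    return projected / (1 - headroom)
```

For example, 100 Gbps of average traffic growing 30% for one year, with 50% headroom, needs about 260 Gbps of installed capacity under this reading.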
Technology Refresh Cycle
Hardware lifecycle:
- Servers: 3-4 year refresh cycle
- Network equipment: 5-7 year lifecycle
- Storage (SSD): 3-5 years (write endurance dependent)
- Storage (HDD): 4-5 years
Refresh strategy:
- Rolling replacement: 25-33% of fleet per year
- Performance improvement: each generation 30-50% better perf/watt
- Capacity growth: refresh provides organic capacity increase
- Zero-downtime: drain server, replace, re-add to pool