How it works:
1. Client requests content from CDN edge
2. Edge checks local cache
3. On cache miss: edge fetches from origin server
4. Edge caches response and serves to client
5. Subsequent requests served from cache until TTL expires
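The five steps above can be sketched as a small pull-through cache. This is a minimal illustration, not any vendor's implementation: `PullThroughCache`, `fetch_from_origin`, and the injectable `clock` are hypothetical names, and a real edge would also honor `Cache-Control`, bound its memory, and handle origin errors.

```python
import time
from typing import Callable, Dict, Tuple


class PullThroughCache:
    """Toy edge cache that fills itself from the origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin      # hypothetical origin client
        self._ttl = ttl_seconds
        self._clock = clock                  # injectable so expiry can be tested
        self._store: Dict[str, Tuple[bytes, float]] = {}  # url -> (body, expires_at)

    def get(self, url: str) -> Tuple[bytes, str]:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0], "HIT"           # step 5: served from cache
        body = self._fetch(url)              # step 3: cache miss, go to origin
        self._store[url] = (body, now + self._ttl)  # step 4: cache, then serve
        return body, "MISS"
```

The first request for each URL pays the origin round-trip, which is exactly the "first request is slow" disadvantage listed below.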
Advantages:
+ Origin is always the source of truth
+ No manual content management required
+ Automatic cache population based on demand
+ Storage efficient: only caches what users actually request
+ Simple operational model for content publishers
+ Works well with dynamic content (short TTLs)
Disadvantages:
- First request for any object is slow (origin round-trip)
- Cache miss storms on TTL expiry for popular content
- Origin must handle burst of cache-fill requests
- Cold cache after PoP restart or new PoP deployment
- Unpredictable origin load patterns
Best for:
- Websites with large content catalogs (long tail)
- Content that changes frequently
- Organizations without dedicated CDN operations teams
- Use cases where freshness is more important than latency
Real-world examples: CloudFront, Cloudflare, Fastly (default mode)
Push CDN (Origin-Push)
How it works:
1. Content publisher uploads content directly to CDN storage
2. CDN distributes content to edge locations proactively
3. Client requests are always served from cache
4. Publisher responsible for updating/invalidating content
Advantages:
+ Every request is a cache hit (no cold start)
+ Predictable, consistent latency for all users
+ Origin servers not needed for serving (CDN is the origin)
+ Better for large files (pre-positioned, no timeout risk)
+ Controlled bandwidth usage (upload during off-peak)
Disadvantages:
- Publisher must manage content lifecycle explicitly
- Storage costs for pre-positioning across all PoPs
- Stale content risk if invalidation fails
- Complex workflow for content updates
- Doesn't scale well for millions of unique URLs
- Wasted storage for content nobody requests
Best for:
- Video on demand (pre-position popular titles)
- Software distribution (OS updates, game patches)
- Static sites with known, finite content sets
- Live event preparation (pre-warm before broadcast)
Real-world examples: Netflix Open Connect, Akamai NetStorage
Hybrid Approach (Production Reality)
Most production CDNs use a hybrid model:
- Pull for general web content (images, CSS, JS, API responses)
- Push for known high-demand content (video catalog, software updates)
- Pre-warm (push) for anticipated traffic (product launches, events)
- Pull with origin shield to reduce origin load
Decision matrix:
| Factor | Use Pull | Use Push |
|---------------------|-----------------|-------------------|
| Content volume | >1M unique URLs | <100K unique URLs |
| Update frequency | Frequent | Infrequent |
| Access pattern | Long tail | Head-heavy |
| File size | <10 MB | >100 MB |
| Latency tolerance | First-hit delay OK | Zero delay required |
| Operational model | Hands-off | Active management |
Anycast vs GeoDNS Routing
Anycast Routing
How it works:
- Same IP address announced via BGP from every PoP
- Internet routing (BGP) naturally directs packets to nearest PoP
- No DNS-level routing decisions needed
- Single IP serves global traffic
Technical implementation:
- Each PoP announces same /24 or /48 prefix
- BGP path selection typically prefers the shortest AS path (subject to each network's local routing policy)
- Failover: withdraw BGP announcement from unhealthy PoP
- Traffic shifts to next-nearest PoP automatically
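The withdraw-and-shift behavior can be illustrated with a toy model. It assumes AS-path length is the only selection criterion, which real BGP does not guarantee (local preference and other attributes are evaluated first). The PoP names and path lengths are made up for the example.

```python
from typing import Dict, Optional, Set


def select_pop(as_path_len: Dict[str, int], announcing: Set[str]) -> Optional[str]:
    """Return the announcing PoP with the shortest AS path, or None if none announce."""
    candidates = {pop: hops for pop, hops in as_path_len.items() if pop in announcing}
    if not candidates:
        return None
    # Tie-break on PoP name so this toy model is deterministic.
    return min(candidates, key=lambda pop: (candidates[pop], pop))


# Hypothetical view from one client network: AS hops to each PoP.
paths = {"fra": 2, "ams": 3, "nyc": 5}
announcing = {"fra", "ams", "nyc"}

nearest = select_pop(paths, announcing)        # "fra" while it announces the prefix
announcing.discard("fra")                      # health check fails: withdraw announcement
after_failover = select_pop(paths, announcing) # traffic shifts to "ams"
```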
Advantages:
+ Single IP address simplifies DNS configuration
+ Automatic failover via BGP (no DNS TTL delay)
+ Inherent DDoS resilience (attack distributed across PoPs)
+ No DNS resolution latency (IP is cached/hardcoded)
+ Works for UDP protocols (DNS, QUIC) where TCP state isn't an issue
+ Failover time: 30-90 seconds (BGP convergence)
Disadvantages:
- Limited control over routing decisions
- BGP routing may not choose lowest-latency path
- TCP connection issues during BGP route changes (RST)
- Cannot route based on content type or customer
- Debugging routing issues is complex
- Some ISPs have suboptimal BGP policies
- Cannot easily do weighted traffic splitting
Mitigation for TCP issues:
- Use QUIC/HTTP3 (connection migration handles route changes)
- Implement connection draining before BGP withdrawal
- Use ECMP (Equal-Cost Multi-Path) for gradual shifts
GeoDNS (Geographic DNS Routing)
How it works:
- Authoritative DNS server estimates client location (from the resolver's IP address or EDNS Client Subnet)
- Returns IP address of nearest/best PoP for that location
- Different clients get different IP addresses
- Each PoP has unique IP addresses
Technical implementation:
- Authoritative DNS servers with geolocation database
- EDNS Client Subnet (ECS) for resolver-level accuracy
- Health checks determine which PoPs are available
- Weighted responses for load balancing within a region
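A rough sketch of how these pieces might fit together in an authoritative resolver: geolocate the client (preferring ECS over the resolver address), filter by health, then pick by weight. Everything here is illustrative: `GEO_DB`, `POPS`, and `resolve` are hypothetical, and the addresses are documentation prefixes standing in for a real geolocation database.

```python
import ipaddress
import random
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical geolocation database: prefix -> region.
GEO_DB = {
    ipaddress.ip_network("192.0.2.0/24"): "eu",
    ipaddress.ip_network("198.51.100.0/24"): "us",
}
# Hypothetical PoP inventory: region -> [(PoP IP, weight)].
POPS: Dict[str, List[Tuple[str, int]]] = {
    "eu": [("203.0.113.10", 3), ("203.0.113.11", 1)],
    "us": [("203.0.113.20", 1)],
}
DEFAULT_REGION = "us"


def region_for(client_ip: str) -> str:
    addr = ipaddress.ip_address(client_ip)
    for network, region in GEO_DB.items():
        if addr in network:
            return region
    return DEFAULT_REGION


def resolve(resolver_ip: str, healthy: Set[str], ecs_ip: Optional[str] = None,
            rng: Optional[random.Random] = None) -> Optional[str]:
    """Return one PoP IP for this query, or None if nothing is healthy."""
    rng = rng or random.Random()
    # Prefer the ECS address when the resolver supplies one.
    region = region_for(ecs_ip or resolver_ip)
    candidates = [(ip, w) for ip, w in POPS[region] if ip in healthy]
    if not candidates:
        # Regional outage: fall back to any healthy PoP elsewhere.
        candidates = [(ip, w) for pops in POPS.values() for ip, w in pops if ip in healthy]
    if not candidates:
        return None
    ips, weights = zip(*candidates)
    return rng.choices(ips, weights=weights, k=1)[0]
```

Without `ecs_ip`, the decision is based on the resolver's location only, which is the accuracy problem described under "Accuracy challenges".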
Advantages:
+ Fine-grained control over traffic routing
+ Can route based on customer, content type, or load
+ Weighted routing for gradual traffic shifts
+ Easy A/B testing and canary deployments
+ Can implement latency-based routing (not just geo)
+ Supports complex failover policies
+ Per-customer routing decisions possible
Disadvantages:
- DNS TTL creates failover delay (30-300 seconds)
- DNS caching at resolvers may serve stale records
- EDNS Client Subnet not universally supported
- More complex DNS infrastructure required
- Multiple IP addresses to manage and monitor
- DNS resolution adds latency to first request
- Resolver location != user location (VPNs, public DNS)
Accuracy challenges:
- Google Public DNS (8.8.8.8): user may be far from resolver
- Without ECS: route based on resolver location, not user
- VPN users: appear in VPN exit location, not actual location
- Mobile users: IP geolocation less accurate
Comparison Matrix
| Factor | Anycast | GeoDNS |
|--------------------------|-------------------------|-------------------------|
| Failover speed | 30-90s (BGP) | 30-300s (DNS TTL) |
| Routing accuracy | Network-level | Application-level |
| DDoS resilience | Excellent (distributed) | Good (can be targeted) |
| Operational complexity | High (BGP expertise) | Medium (DNS management) |
| Granularity of control | Low | High |
| Protocol support | All (L3 routing) | DNS-dependent protocols |
| TCP connection stability | Risk during failover | Stable (same IP) |
| Cost | Higher (BGP transit) | Lower (DNS only) |
Production recommendation:
Use both together (industry standard):
- Anycast for DNS resolution itself (fast, resilient DNS)
- GeoDNS logic within anycast DNS to return best edge IP
- Anycast for HTTP/3 QUIC traffic (connection migration)
- GeoDNS for HTTP/1.1 and HTTP/2 (TCP stability)
Cache-Everything vs Selective Caching
Cache-Everything Approach
Philosophy: Cache all responses by default, opt-out for specific paths
Implementation:
- Default TTL for all responses (e.g., 1 hour)
- Override with Cache-Control headers from origin
- Cache even POST responses if safe (idempotent APIs)
- Cache error responses (negative caching) with short TTL
Advantages:
+ Maximum origin offload
+ Simple mental model ("everything is cached")
+ Catches cacheable content that wasn't explicitly marked
+ Better performance for forgotten/misconfigured resources
+ Reduces origin infrastructure costs significantly
Disadvantages:
- Risk of caching personalized/sensitive content
- Stale data for dynamic content without proper headers
- Cache poisoning risk if cache key isn't comprehensive
- Debugging issues: "why am I seeing old content?"
- Storage waste for truly uncacheable content
- Privacy concerns (cached authenticated responses)
Safeguards needed:
- Never cache responses with Set-Cookie headers
- Never cache responses to requests with Authorization header (unless explicit)
- Strip/ignore query params that don't affect response
- Implement cache tags for targeted invalidation
- Monitor for accidental PII caching
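A minimal sketch of the first three safeguards, assuming a cache-everything edge that inspects headers before storing. `may_cache`, `cache_key`, and the `IGNORED_PARAMS` list are hypothetical, and the substring checks on `Cache-Control` are deliberately crude compared with a real directive parser.

```python
from typing import Mapping
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed list of query parameters that do not affect the response.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}


def may_cache(request_headers: Mapping[str, str],
              response_headers: Mapping[str, str]) -> bool:
    """Return False for responses that look personalized or sensitive."""
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}
    if "set-cookie" in resp:
        return False
    cache_control = resp.get("cache-control", "").lower()
    if "private" in cache_control or "no-store" in cache_control:
        return False
    # Authenticated requests are cacheable only when the origin opts in explicitly.
    if "authorization" in req and "public" not in cache_control:
        return False
    return True


def cache_key(url: str) -> str:
    """Normalize a URL: drop ignored params, sort the rest, lowercase the host."""
    parts = urlsplit(url)
    params = sorted((k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
                    if k not in IGNORED_PARAMS)
    query = urlencode(params)
    base = f"{parts.scheme}://{parts.netloc.lower()}{parts.path}"
    return f"{base}?{query}" if query else base
```

Sorting the remaining parameters means `?a=1&b=2` and `?b=2&a=1` share one cache entry, which raises the hit ratio without changing what is served.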
Selective Caching Approach
Philosophy: Only cache content explicitly marked as cacheable
Implementation:
- Require explicit Cache-Control: public, max-age=N
- Pass through anything without cache headers
- Whitelist specific path patterns for caching
- Default behavior: proxy without caching
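The opt-in rule can be expressed as a function that returns a TTL only when the origin sent both `public` and `max-age`. This is a sketch of that one policy, not a complete `Cache-Control` parser. `explicit_ttl` is a hypothetical helper, and directives such as `s-maxage` are intentionally ignored here.

```python
import re
from typing import Optional

_MAX_AGE = re.compile(r"max-age=(\d+)")


def explicit_ttl(cache_control: Optional[str]) -> Optional[int]:
    """Return max-age in seconds only if the response is explicitly public.

    None means "proxy without caching", the default in selective caching.
    """
    if not cache_control:
        return None
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "public" not in directives:
        return None
    for directive in directives:
        match = _MAX_AGE.fullmatch(directive)
        if match:
            return int(match.group(1))
    return None
```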
Advantages:
+ No risk of caching sensitive/personalized content
+ Origin has full control over what's cached
+ Simpler debugging (cache behavior is explicit)
+ No stale data surprises
+ Better for compliance-sensitive applications
Disadvantages:
- Lower cache hit ratio (many cacheable things not marked)
- Higher origin load (more pass-through requests)
- Requires origin developers to set proper headers
- Missed optimization opportunities
- More expensive to operate (more origin capacity needed)
When to use:
- Financial/healthcare applications (compliance requirements)
- Highly personalized content (e-commerce with user context)
- APIs with authentication (risk of response leakage)
- Early-stage products (before caching strategy is mature)
TTL-Based vs Event-Based Invalidation
TTL-Based Expiration
How it works:
- Each cached object has an expiration time
- After TTL expires, next request triggers revalidation
- Origin confirms freshness (304) or sends new content (200)
- No active invalidation needed
Advantages:
+ Simple to implement and understand
+ No invalidation infrastructure needed
+ Self-healing: stale content eventually refreshes
+ Predictable origin load (revalidation spread over time)
+ Works without any coordination between systems
Disadvantages:
- Content can be stale for up to TTL duration
- Short TTLs increase origin load
- Long TTLs risk serving outdated content
- No way to force immediate update
- TTL is a guess (how long will content be valid?)
- Thundering herd on popular content TTL expiry
Optimization: stale-while-revalidate
- Serve stale content immediately
- Revalidate in background asynchronously
- User gets fast response, content refreshes behind the scenes
- Eliminates latency penalty of revalidation
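One way to picture stale-while-revalidate is a cache with three states: fresh, stale-but-servable, and expired. The sketch below is a simplification: `SwrCache` is a hypothetical class, and the "background" refresh is modeled as a queue drained by `run_pending_refreshes()` so the behavior stays deterministic. A real edge would refresh asynchronously.

```python
import time
from typing import Callable, Dict, List, Tuple


class SwrCache:
    def __init__(self, fetch: Callable[[str], str], ttl: float, swr: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl          # freshness lifetime (max-age)
        self._swr = swr          # extra window in which stale content may be served
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # key -> (value, fetched_at)
        self._pending: List[str] = []                   # keys awaiting refresh

    def get(self, key: str) -> Tuple[str, str]:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None:
            value, fetched_at = entry
            age = now - fetched_at
            if age < self._ttl:
                return value, "FRESH"
            if age < self._ttl + self._swr:
                # Serve stale immediately and schedule a single refresh.
                if key not in self._pending:
                    self._pending.append(key)
                return value, "STALE"
        # No entry, or too old even for the stale window: blocking fetch.
        value = self._fetch(key)
        self._store[key] = (value, now)
        return value, "MISS"

    def run_pending_refreshes(self) -> None:
        """Stand-in for the asynchronous background revalidation."""
        while self._pending:
            key = self._pending.pop(0)
            self._store[key] = (self._fetch(key), self._clock())
```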
Event-Based Invalidation (Purge)
How it works:
- Content cached with long TTL (hours/days/forever)
- When content changes, publish invalidation event
- CDN purges affected objects from all edges
- Next request fetches fresh content from origin
Advantages:
+ Content always fresh (purge on change)
+ Can use very long TTLs (better hit ratio)
+ Immediate consistency when needed
+ Precise control over what's invalidated
+ Lower origin load (fewer revalidation requests)
Disadvantages:
- Requires invalidation infrastructure (pub/sub, queues)
- Purge propagation takes time (1-30 seconds globally)
- Brief inconsistency window during propagation
- Complexity: must track what to purge when data changes
- Purge storms can overwhelm the system
- Risk of over-purging (invalidating too much)
- Cost: purge APIs often have rate limits and charges
Implementation patterns:
- Webhook on CMS publish → trigger CDN purge API
- Database change data capture → purge affected URLs
- Cache tags: tag objects with logical groups, purge by tag
- Surrogate keys: purge all objects with a given key
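Cache tags and surrogate keys both come down to a reverse index from a logical label to the URLs that carry it. A minimal in-memory sketch, assuming a single cache node (`TaggedCache` and its method names are hypothetical. A real CDN would propagate the purge to every edge):

```python
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = {}
        self._tags_by_url: Dict[str, Set[str]] = {}

    def put(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        self.purge_url(url)  # drop old tag associations before re-indexing
        self._objects[url] = body
        self._tags_by_url[url] = set(tags)
        for tag in self._tags_by_url[url]:
            self._urls_by_tag.setdefault(tag, set()).add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._objects.get(url)

    def purge_url(self, url: str) -> None:
        self._objects.pop(url, None)
        for tag in self._tags_by_url.pop(url, set()):
            self._urls_by_tag.get(tag, set()).discard(url)

    def purge_tag(self, tag: str) -> int:
        """Purge every object carrying `tag`; return how many were removed."""
        urls = list(self._urls_by_tag.pop(tag, set()))
        for url in urls:
            self.purge_url(url)
        return len(urls)
```

With this index, a category rename becomes one `purge_tag("category-shoes")` call rather than an enumeration of every affected product URL.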
Comparison
| Factor | TTL-Based | Event-Based |
|---------------------------|-------------------------|----------------------------|
| Freshness guarantee | Eventual (within TTL) | Near-immediate |
| Implementation complexity | Low | High |
| Origin load | Higher (revalidation) | Lower (long TTLs) |
| Consistency | Weak | Strong (after propagation) |
| Operational overhead | None | Purge infrastructure |
| Best for | Slowly changing content | Frequently updated content |
| Failure mode | Stale content | Missing purge = stale |
Production recommendation:
Use both together:
- TTL as safety net (content eventually refreshes even if purge fails)
- Event-based purge for immediate freshness on critical updates
- stale-while-revalidate for non-critical content
- Cache tags for efficient bulk invalidation
Example: E-commerce product page
- TTL: 1 hour (safety net)
- Purge on: price change, stock change, description update
- stale-while-revalidate: 5 minutes (non-critical updates)
- Result: Usually fresh within seconds, guaranteed fresh within 1 hour
Single-Tier vs Multi-Tier Caching Hierarchy
Single-Tier (Edge Only)
Architecture: Client → Edge PoP → Origin
Advantages:
+ Simplest architecture
+ Lowest latency for cache hits (one hop)
+ Fewer points of failure
+ Easier to debug and monitor
+ Lower infrastructure cost
Disadvantages:
- Each PoP independently fetches from origin
- Origin receives N * miss_rate requests (N = number of PoPs)
- Cold PoPs have poor hit ratios
- Popular content fetched redundantly by every PoP
- Origin must handle high request volume
When appropriate:
- Small number of PoPs (<20)
- Origin can handle the load
- Content is highly cacheable (>95% hit ratio)
- Low content diversity (small catalog)
Multi-Tier (Edge + Shield + Origin)
Architecture: Client → Edge PoP → Regional Shield → Origin
Two-tier variant:
Client → Edge → Shield → Origin
- Shield consolidates misses from 20-50 edge PoPs
- Origin sees as little as 1/20th to 1/50th of the miss traffic (best case, for content requested at every edge)
Three-tier variant:
Client → Edge → Regional Mid-Tier → Global Shield → Origin
- Edge: hot content, small cache
- Regional mid-tier: warm content, medium cache
- Global shield: cold content, large cache
- Origin: only truly uncached content
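The consolidation effect of a shield can be checked with simple arithmetic. The model below is a deliberate worst case and rests on stated assumptions: every edge PoP requests every object once per TTL window, and each cache facing the origin fetches an object exactly once per window. `origin_fetches_per_window` is a hypothetical helper for illustration.

```python
def origin_fetches_per_window(objects: int, pops: int, shields: int = 0) -> int:
    """Origin fetches per TTL window under the worst-case model described above.

    With no shield tier, every PoP fetches each object itself.
    With shields, only the shield caches talk to the origin.
    """
    if objects < 0 or pops < 1 or shields < 0:
        raise ValueError("objects >= 0, pops >= 1, shields >= 0 required")
    caches_facing_origin = shields if shields > 0 else pops
    return objects * caches_facing_origin


single_tier = origin_fetches_per_window(objects=1000, pops=40)            # 40 PoPs hit origin
two_tier = origin_fetches_per_window(objects=1000, pops=40, shields=1)    # one shield hits origin
reduction_factor = single_tier // two_tier
```

With 40 edges behind one shield the origin sees 1/40th of the fetches in this model. Real reductions are smaller because not every edge requests every object.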
Advantages:
+ Dramatically reduces origin load (60-90% reduction)
+ Better hit ratios at shield (aggregated demand)
+ Origin can be smaller/cheaper
+ Handles flash crowds better (shield absorbs)
+ Enables request coalescing at shield layer
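Request coalescing means that concurrent misses for the same object share a single origin fetch. A minimal asyncio sketch of the idea follows. `CoalescingFetcher` and `fetch_from_origin` are hypothetical names, and error handling and cancellation are left out.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class CoalescingFetcher:
    """Collapse concurrent requests for one URL into a single origin fetch."""

    def __init__(self, fetch_from_origin: Callable[[str], Awaitable[bytes]]) -> None:
        self._fetch = fetch_from_origin
        self._inflight: Dict[str, "asyncio.Task[bytes]"] = {}

    async def get(self, url: str) -> bytes:
        task = self._inflight.get(url)
        if task is None:
            # First requester starts the fetch; later ones await the same task.
            task = asyncio.create_task(self._fetch_once(url))
            self._inflight[url] = task
        return await task

    async def _fetch_once(self, url: str) -> bytes:
        try:
            return await self._fetch(url)
        finally:
            # Allow a new fetch once this one has completed.
            self._inflight.pop(url, None)
```

During a flash crowd, thousands of simultaneous misses at the shield turn into one request to the origin per object.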
Disadvantages:
- Additional latency on cache miss (extra hop)
- More complex architecture to operate
- Shield becomes a potential bottleneck/SPOF
- Higher infrastructure cost (shield servers)
- Debugging cache behavior across tiers is complex
- Stale content can persist longer (cached at multiple tiers)
When appropriate:
- Large number of PoPs (>50)
- Origin is expensive or capacity-limited
- Content catalog is large (long tail)
- Flash crowd protection is important
- Origin is geographically distant from most users
Proprietary CDN vs Multi-CDN vs Build-Your-Own
Single Proprietary CDN (CloudFront, Akamai, Cloudflare)
Advantages:
+ Turnkey solution, fast time to market
+ Global infrastructure already deployed
+ DDoS protection included
+ Managed SSL certificates
+ Edge compute capabilities
+ 24/7 NOC and support
+ Continuous platform improvements
Disadvantages:
- Vendor lock-in (proprietary APIs, edge functions)
- Limited customization
- Cost at scale ($0.02-0.15/GB vs $0.005 internal cost)
- Single point of failure (provider outage)
- Limited visibility into infrastructure
- Feature roadmap controlled by vendor
Cost at scale:
- 1 PB/month: ~$20,000-50,000/month
- 10 PB/month: ~$100,000-300,000/month
- 100 PB/month: ~$500,000-2,000,000/month
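The ranges above imply a volume discount, which becomes visible when they are converted to an effective per-GB rate. The helper below is a small worked calculation, assuming decimal units (1 PB = 1,000,000 GB). `effective_rate_per_gb` is a hypothetical name.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units


def effective_rate_per_gb(monthly_cost_usd: float, petabytes_per_month: float) -> float:
    """Convert a monthly bill into an effective USD-per-GB rate."""
    if petabytes_per_month <= 0:
        raise ValueError("traffic volume must be positive")
    return monthly_cost_usd / (petabytes_per_month * GB_PER_PB)


# Ranges quoted above: (PB/month, low USD, high USD).
QUOTED = [(1, 20_000, 50_000), (10, 100_000, 300_000), (100, 500_000, 2_000_000)]

for pb, low, high in QUOTED:
    low_rate = effective_rate_per_gb(low, pb)
    high_rate = effective_rate_per_gb(high, pb)
    print(f"{pb:>3} PB/month: ${low_rate:.3f}-{high_rate:.3f} per GB")
```

The quoted figures work out to roughly $0.02-0.05/GB at 1 PB, $0.01-0.03/GB at 10 PB, and $0.005-0.02/GB at 100 PB per month.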
Multi-CDN Strategy
Advantages:
+ No single provider SPOF
+ Best performance per region (use best CDN per geo)
+ Cost optimization (competitive bidding)
+ Leverage each CDN's strengths
+ Negotiating leverage with providers
Disadvantages:
- Operational complexity (multiple dashboards, APIs)
- Cache fragmentation (content split across CDNs)
- Inconsistent feature sets across providers
- Complex purge coordination
- Higher total cost than single CDN (less volume discount)
- Need traffic management layer (DNS or client-side)
Implementation cost:
- Traffic management platform: $50K-200K/year
- Engineering overhead: 1-2 FTEs dedicated
- Monitoring across CDNs: additional tooling costs
Build-Your-Own CDN
Advantages:
+ Full control over every aspect
+ Lowest cost at massive scale (>100 PB/month)
+ Custom optimizations for specific use case
+ No vendor dependencies
+ Competitive advantage (unique capabilities)
Disadvantages:
- Enormous upfront investment ($50M-500M)
- 2-5 year build timeline to reach parity
- Requires specialized networking/systems talent (50-200 engineers)
- Ongoing operational burden (hardware, peering, NOC)
- Must build DDoS protection, WAF, etc.
- Regulatory compliance in each country
Break-even analysis:
- Below 50 PB/month: use managed CDN
- 50-500 PB/month: multi-CDN or hybrid (own + managed)
- Above 500 PB/month: build your own (Netflix, Google, Facebook)
Companies that built their own:
- Netflix (Open Connect): 100+ Tbps, ISP-embedded
- Google (GFE/Cloud CDN): integrated with search/YouTube
- Facebook (Edge PoPs): social content delivery
- Apple: iCloud and media delivery
Edge Compute vs Origin Compute
Edge Compute
Execute logic at CDN edge, close to users
Use cases:
- A/B testing (route users to variants without origin)
- Authentication/authorization (validate tokens at edge)
- URL rewriting and redirects
- Header manipulation (add security headers, CORS)
- Image/video optimization (resize, format conversion)
- Geolocation-based content (language, pricing)
- Bot detection and blocking
- Request/response transformation
Constraints:
- Limited CPU time (5-50ms typical)
- Limited memory (128 MB typical)
- No persistent storage (stateless)
- Limited network access (restricted subrequests)
- Cold start considerations
- Debugging is harder (distributed execution)
Best for:
- Latency-sensitive logic
- Simple transformations
- Decisions that don't need backend data
- High-volume, low-complexity operations
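A/B testing is a good fit for the edge because variant assignment can be computed statelessly from the request itself. The sketch below hashes a user identifier into one of 100 buckets. `ab_variant` is a hypothetical function and is written in Python for illustration only, since real edge runtimes typically expose JavaScript or WebAssembly.

```python
import hashlib


def ab_variant(user_id: str, experiment: str, percent_b: int) -> str:
    """Deterministically assign a user to variant "A" or "B".

    No storage or origin call is needed: the same inputs always
    produce the same bucket, on any edge node.
    """
    if not 0 <= percent_b <= 100:
        raise ValueError("percent_b must be between 0 and 100")
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "B" if bucket < percent_b else "A"
```

Including the experiment name in the hash input keeps bucket assignments independent across experiments.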
Origin Compute
Execute logic at centralized origin servers
Use cases:
- Complex business logic
- Database queries and transactions
- Machine learning inference (large models)
- Long-running computations
- Stateful operations
- Third-party API integrations
Advantages over edge:
+ Unlimited compute resources
+ Access to databases and state
+ Full programming language support
+ Easier debugging and monitoring
+ Simpler deployment model
+ No cold start concerns (always running)
Best for:
- Complex application logic
- Data-intensive operations
- Operations requiring consistency
- Long-running processes
- Operations needing large memory/CPU
Decision Framework
| Factor | Edge | Origin |
|------------------------|---------------------|----------------------|
| Latency requirement | <50ms | <500ms acceptable |
| Computation complexity | Simple | Complex |
| State needed | None/minimal | Database access |
| Request volume | Very high | Moderate |
| Personalization | Light (geo, device) | Heavy (user history) |
| Data dependencies | None | Multiple services |
HTTP/2 Push vs Preload Hints
HTTP/2 Server Push (Deprecated)
How it worked:
- Server proactively sends resources before client requests them
- Pushed alongside the HTML response
- Client receives CSS/JS without additional round trips
Why it failed:
- Pushed resources often already in browser cache (wasted bandwidth)
- No way to know client's cache state before pushing
- Complex implementation for marginal benefit
- Removed from Chrome (2022), other browsers following
- CDN implementation was inconsistent
Performance impact:
- Best case: saved 1 RTT for critical resources
- Worst case: wasted bandwidth pushing cached resources
- Average: negligible improvement in real-world measurements
Preload Hints (103 Early Hints)
How it works:
- Server sends 103 Early Hints response before final response
- Contains Link: <resource>; rel=preload headers
- Browser begins fetching hinted resources immediately
- Final response (200) arrives with full content
Advantages over Server Push:
+ Browser checks cache before fetching (no waste)
+ Works with CDN caching (hints can be cached too)
+ Simpler implementation
+ Compatible with all HTTP versions
+ CDN can send hints while waiting for origin response
CDN implementation:
1. Client requests HTML page
2. CDN sends 103 with preload hints (from cache or config)
3. CDN fetches HTML from origin (or cache)
4. Client already loading CSS/JS while waiting for HTML
5. CDN sends 200 with HTML content
Performance benefit:
- Saves origin processing time (hints sent immediately)
- 100-500ms improvement for pages with slow origins
- No wasted bandwidth (browser respects cache)
Configuration example:
Link: </styles/main.css>; rel=preload; as=style
Link: </scripts/app.js>; rel=preload; as=script
Link: </fonts/inter.woff2>; rel=preload; as=font; crossorigin
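For illustration, the header lines above could be generated from a per-page resource list. This is a formatting sketch only: `early_hints_lines` is a hypothetical helper, the status line uses HTTP/1.1 text framing for readability, and actual CDNs expose Early Hints through their own configuration rather than code like this.

```python
from typing import Iterable, List, Tuple


def early_hints_lines(resources: Iterable[Tuple[str, str, bool]]) -> List[str]:
    """Build the lines of a 103 Early Hints response.

    Each resource is (path, as_type, crossorigin).
    """
    lines = ["HTTP/1.1 103 Early Hints"]
    for path, as_type, crossorigin in resources:
        value = f"<{path}>; rel=preload; as={as_type}"
        if crossorigin:
            # Fonts are fetched in CORS mode, so the hint must say so.
            value += "; crossorigin"
        lines.append(f"Link: {value}")
    return lines


hints = early_hints_lines([
    ("/styles/main.css", "style", False),
    ("/scripts/app.js", "script", False),
    ("/fonts/inter.woff2", "font", True),
])
```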
Recommendation
- Do NOT use HTTP/2 Server Push (deprecated, removed from browsers)
- DO use 103 Early Hints for critical resources
- DO use <link rel="preload"> in HTML for important resources
- DO use CDN-level early hints configuration
- Consider: preconnect hints for third-party origins