-- Stored in-memory and on local SSD at each edge server-- Optimized for fast lookup by URL hash
CacheEntry {
url_hash: BYTES(32) -- SHA-256 of normalized URL (primary key)
original_url: STRING(4096) -- Full URL including query params
content_hash: BYTES(32) -- SHA-256 of response body (for dedup)
content_type: STRING(128) -- MIME type (e.g., "image/webp")
content_encoding: STRING(32) -- Compression (gzip, br, identity)
content_length: INT64 -- Size in bytes-- Cache control
ttl_seconds: INT32 -- Time-to-live from origin
created_at: TIMESTAMP-- When cached at this edge
expires_at: TIMESTAMP-- Absolute expiration time
last_modified: TIMESTAMP-- Origin Last-Modified header
etag: STRING(256) -- Origin ETag for conditional requests-- Vary handling
vary_headers: STRING(512) -- Vary header value
vary_key: BYTES(32) -- Hash of varied header values-- Storage location
disk_path: STRING(512) -- Path on local SSD/HDD
disk_offset: INT64 -- Offset within storage file
memory_cached: BOOL -- Currently in RAM cache-- Access patterns
hit_count: INT64 -- Total hits since cached
last_accessed: TIMESTAMP-- For LRU eviction
access_frequency: FLOAT32 -- Exponentially weighted moving avg-- Response headers to preserve
response_headers: MAP<STRING, STRING>
status_code: INT16 -- Original response status-- Metadata
origin_id: STRING(64) -- Which origin served this
distribution_id: STRING(64) -- Customer distribution identifier
is_negative_cache: BOOL -- Cached 404/error response
}
Variant Storage (for Vary header support)
CacheVariant {
url_hash: BYTES(32) -- Same URL
variant_key: BYTES(32) -- Hash of vary header values
cache_entry_id: BYTES(32) -- Points to specific CacheEntry
vary_values: MAP<STRING, STRING>-- Actual header values
}
Cache Index Data Structures
Primary Index: Hash Table (In-Memory)
Architecture: Open-addressing hash tablewith linear probing
- Key: SHA-256(normalized_url + variant_key) = 32 bytes
- Value: Pointer to metadata + disk location = 24 bytes
- Load factor: 0.7 (resize at 70% capacity)
- Total entries per server: 500M - 2B objects
- Memory usage: ~30-100 GB per server
Lookup complexity: O(1) average, O(n) worst caseInsert complexity: O(1) amortized
LRU Eviction Structure
Architecture: Segmented LRU (SLRU) with two segments
Probationary Segment (20%of cache):
- New objects enter here
- Evicted first when space needed
- Promoted toprotectedon second access
Protected Segment (80%of cache):
- Frequently accessed objects
- Evicted only when probationary is empty
- Demoted to probationary tail on eviction
Implementation: Doubly-linked list with hash map for O(1) ops
- Move to head on access: O(1)
- Evict from tail: O(1)
- Lookup bykey: O(1) via hash map
Bloom Filters (Negative Lookup Optimization)
Purpose: Quickly determine if an object is NOT in cache
- Avoids expensive disk lookups for uncached objects
- False positive rate: 1% (tunable)
- Size: ~10 bits per object = 1.2 GB for 1 billion objects
- Hash functions: 7 (optimal for 1% FPR)
Usage flow:
1. Request arrives - check bloom filter
2. If "no" - guaranteed miss - fetch from origin
3. If "maybe" - check hash table - may still be miss
Consistent Hash Ring (Intra-PoP Distribution)
Purpose: Distribute objects across servers within a PoP
- Virtual nodes: 150-200 per physical server
- Hash function: xxHash64 for speed
- Rebalancing on server add/remove: only 1/N objects move
- Weighted nodes: more capacity = more virtual nodes
Ring structure:
- Sorted array of (hash_value, server_id) pairs
- Binary search for lookup: O(log N)
- N = num_virtual_nodes (typically 10,000-50,000 per PoP)
Hot tier (0-24 hours): Kafka/Kinesis for real-time analytics
Warm tier (1-30 days): ClickHouse/Druid for ad-hoc queries
Cold tier (30-365 days): S3/GCS in Parquet for compliance
Archive (1+ years): Glacier/Archive tier for legal hold
Compression: 10-20x with columnar + zstd
Partitioning: By distribution_id, date, edge_region
GeoDNS Resolution Flow:
1. Client DNS query arrives at authoritative nameserver
2. Extract client IP (or EDNS Client Subnet if available)
3. Map IP to geographic location (MaxMind GeoIP database)
4. Query routing policies for matching geo rules
5. Apply health check status (exclude unhealthy endpoints)
6. Return closest healthy edge PoP IP(s)
7. TTL: 30-60 seconds (balance freshness vs DNS load)
Anycast Resolution Flow:
1. All edge PoPs announce same IP prefix via BGP
2. Internet routing naturally directs to nearest PoP
3. No DNS-level geographic routing needed
4. Failover: withdraw BGP announcement from unhealthy PoP