Google Search - Database Design
Storage Systems Overview
Primary Storage Technologies
- Bigtable: Structured data (crawl metadata, index metadata)
- GFS/Colossus: Large files (index shards, crawled content)
- Spanner: Globally distributed data (user data, settings)
- Memcache/Redis: Caching layer
Inverted Index Schema
Index Structure
Inverted Index (Key-Value Store):
Key: Term (string)
Value: Posting List (compressed binary)
Posting List Entry:
{
doc_id: uint64, // Document identifier
term_frequency: uint16, // Occurrences in document
positions: []uint32, // Word positions
field_flags: uint8, // Title, body, anchor, etc.
proximity_score: float32 // Distance to other terms
}
Example:
"machine learning" → [
{doc_id: 12345, tf: 5, positions: [10, 45, 67, 89, 120],
field_flags: 0b00000011, proximity: 0.95},
{doc_id: 67890, tf: 2, positions: [5, 234],
field_flags: 0b00000001, proximity: 0.87},
...
]
Compression:
- Delta encoding for doc_ids
- Variable-length encoding for positions
- 10x compression ratio
- 1 PB compressed from 10 PB rawIndex Sharding
Sharding Strategy: Hash-based by term
Shard Assignment:
shard_id = hash(term) % 10000
Shard Size:
- 10,000 shards total
- 10 GB per shard (compressed)
- 100 GB per shard (uncompressed)
- 3x replication
Benefits:
- Even distribution of terms
- Parallel query processing
- Independent shard updates
- Fault isolationDocument Store Schema
Document Metadata (Bigtable)
Table: documents
Row Key: doc_id (uint64)
Column Families:
- content: HTML, text, title
- metadata: URL, crawl_time, language
- links: outlinks, inlinks
- quality: PageRank, spam_score
- engagement: CTR, dwell_time
Schema:
{
doc_id: 12345,
content: {
url: "https://example.com/page",
title: "Example Page Title",
text: "Page content...",
html: "<html>...</html>",
snippet: "Preview text..."
},
metadata: {
crawl_time: 1704715200,
last_modified: 1704700000,
language: "en",
content_type: "text/html",
size_bytes: 45678
},
links: {
outlinks: [67890, 11111, 22222],
inlinks: [33333, 44444],
anchor_texts: ["example", "sample page"]
},
quality: {
pagerank: 0.00234,
spam_score: 0.05,
authority_score: 0.78,
trust_score: 0.92
},
engagement: {
impressions: 10000,
clicks: 500,
ctr: 0.05,
avg_dwell_time: 45.3
}
}URL Frontier (Crawl Queue)
Table: url_frontier
Row Key: priority_score + url_hash
Columns:
- url: Full URL
- priority: Crawl priority (0-100)
- last_crawl: Last crawl timestamp
- crawl_frequency: How often to crawl
- depth: Distance from seed URLs
- domain: Domain name
Priority Calculation:
priority = (pagerank × 0.4) +
(freshness_need × 0.3) +
(link_count × 0.2) +
(user_interest × 0.1)
Sharding:
- Shard by domain (politeness)
- Each crawler handles specific domains
- Avoid overloading single domainLink Graph Schema
Link Graph Storage
Table: link_graph
Row Key: source_doc_id
Columns:
- outlinks: [target_doc_id, anchor_text, position]
- inlink_count: Number of inbound links
- pagerank: Computed PageRank score
Example:
{
source_doc_id: 12345,
outlinks: [
{target: 67890, anchor: "machine learning", pos: 45},
{target: 11111, anchor: "AI tutorial", pos: 123}
],
inlink_count: 234,
pagerank: 0.00234
}
Reverse Index (for inlinks):
Table: reverse_links
Row Key: target_doc_id
Columns: [source_doc_id, anchor_text]PageRank Computation
Algorithm: Power Iteration
Pseudocode:
PR(A) = (1-d) + d × Σ(PR(Ti) / C(Ti))
Where:
- PR(A): PageRank of page A
- d: Damping factor (0.85)
- Ti: Pages linking to A
- C(Ti): Outlink count of Ti
Implementation:
1. Initialize all pages: PR = 1/N
2. Iterate 20-30 times:
- For each page: Calculate new PR
- Update PR values
3. Converge when change < threshold
4. Store final PR scores
Distributed Computation:
- MapReduce for parallel processing
- Process 100B pages in 24 hours
- Update monthlyQuery Log Schema
Query Logs (Bigtable)
Table: query_logs
Row Key: timestamp + user_id + query_hash
Columns:
- query: Query text
- user_id: User identifier (hashed)
- timestamp: Query time
- results_shown: Top 10 result doc_ids
- clicks: Clicked result positions
- dwell_times: Time spent on each result
- location: User location (city-level)
- device: Desktop, mobile, tablet
Example:
{
query: "machine learning tutorial",
user_id: "hash_abc123",
timestamp: 1704715200,
results_shown: [12345, 67890, 11111, ...],
clicks: [0, 2], // Clicked 1st and 3rd results
dwell_times: [45.3, 120.5], // Seconds
location: "San Francisco, CA",
device: "mobile"
}
Usage:
- Ranking model training
- Query understanding
- Autocomplete suggestions
- Trending queriesAutocomplete Data
Autocomplete Trie
Data Structure: Prefix Tree (Trie)
Node Structure:
{
prefix: "mach",
children: {
'i': Node, // "machi"
'o': Node // "macho"
},
suggestions: [
{query: "machine learning", score: 95, count: 1M},
{query: "machine translation", score: 78, count: 500K}
]
}
Scoring:
score = (query_frequency × 0.6) +
(recency × 0.2) +
(personalization × 0.2)
Storage:
- 100M unique queries
- 1B prefix nodes
- 10 GB compressed
- Replicated to edge locationsCache Schema
Query Result Cache (Redis)
Key: query_hash + location + language
Value: Serialized search results
TTL: 1 hour (popular queries), 5 minutes (fresh queries)
Example:
Key: "sha256(machine learning):SF:en"
Value: {
results: [
{doc_id: 12345, title: "...", snippet: "...", url: "..."},
...
],
total_results: 45000000,
query_time_ms: 234,
cached_at: 1704715200
}
Cache Invalidation:
- TTL-based expiration
- Manual invalidation for updated pages
- LRU eviction for memory managementIndex Metadata
Shard Metadata (Bigtable)
Table: index_shards
Row Key: shard_id
Columns:
- shard_id: Shard identifier (0-9999)
- term_range: [start_term, end_term]
- doc_count: Documents in shard
- term_count: Unique terms in shard
- size_bytes: Shard size
- last_updated: Last update timestamp
- servers: Servers hosting this shard
Example:
{
shard_id: 1234,
term_range: ["machine", "magazine"],
doc_count: 10000000,
term_count: 50000,
size_bytes: 10737418240, // 10 GB
last_updated: 1704715200,
servers: ["server-1234", "server-5678", "server-9012"]
}User Data Schema
User Profiles (Spanner)
Table: users
Primary Key: user_id
Columns:
- user_id: UUID
- email: Encrypted email
- preferences: JSON (language, safe search, etc.)
- search_history: Recent queries (encrypted)
- location: Default location
- created_at: Account creation time
Example:
{
user_id: "uuid-123",
email: "encrypted_email",
preferences: {
language: "en",
safe_search: true,
results_per_page: 10
},
search_history: ["encrypted_query_1", "encrypted_query_2"],
location: "San Francisco, CA",
created_at: 1704715200
}
Privacy:
- Encrypted at rest
- Anonymized for analytics
- GDPR compliant
- User can delete historyData Retention
Retention Policies
Crawled Content:
- Current version: Indefinite
- Historical versions: 90 days
- Deleted pages: 30 days (for recovery)
Index Data:
- Active index: Indefinite
- Incremental updates: 7 days
- Old index versions: 30 days
Query Logs:
- Detailed logs: 18 months
- Aggregated data: 5 years
- Anonymized data: Indefinite
User Data:
- Active users: Indefinite
- Inactive users (>2 years): Anonymize
- Deleted accounts: 30 days grace periodBackup and Recovery
Backup Strategy
Index Backups:
- Full backup: Weekly
- Incremental backup: Daily
- Snapshot: Every 6 hours
- Retention: 30 days
Recovery:
- Point-in-time recovery
- Shard-level recovery
- Cross-region replication
- RTO: 4 hours, RPO: 6 hoursThis database design supports Google Search's massive scale with distributed storage, efficient indexing, and fast query processing while maintaining data consistency and reliability.