Scaling Considerations

Part 6 of 10

Design Facebook Newsfeed - Scaling Considerations

Horizontal Scaling

Application Layer

  • 200,000+ stateless application servers
  • Auto-scaling based on CPU, memory, request queue depth, and response time
  • Independent scaling per service
  • Geographic distribution

Database Scaling

  • 5,000 database shards
  • 5 read replicas per shard
  • Async replication
  • Cross-region replication
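
To make the shard and replica numbers concrete, here is a minimal routing sketch. The hashing scheme, the function names, and the round-robin replica choice are assumptions for illustration; the text only states the shard and replica counts. Because replication is asynchronous, a read served by a replica may briefly lag the primary.

```python
import hashlib

NUM_SHARDS = 5000         # from the list above
REPLICAS_PER_SHARD = 5    # from the list above


def shard_for_user(user_id: int) -> int:
    """Map a user ID to a shard with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def pick_target(user_id: int, is_write: bool, request_seq: int = 0) -> tuple[int, str]:
    """Send writes to the shard primary and rotate reads across replicas."""
    shard = shard_for_user(user_id)
    if is_write:
        return shard, "primary"
    # Round-robin over replicas; replicas may be slightly stale (async replication).
    return shard, f"replica-{request_seq % REPLICAS_PER_SHARD}"
```

A plain modulo makes resharding expensive, so a production design would more likely use consistent hashing or a directory service; the modulo is kept here only for brevity.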

Caching Strategy

  • L1: Application memory (1 min TTL)
  • L2: Memcached 100TB (5-60 min TTL)
  • L3: CDN (24 hours - 7 days TTL)
  • 95%+ cache hit rate target
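
The tiered lookup can be sketched as a read-through chain. This is an illustrative sketch only: `TTLCache` and `tiered_get` are hypothetical names, both tiers are simulated in process memory, and the CDN tier is left out because it sits in front of media rather than in the application's lookup path.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-memory cache with a fixed TTL (stand-in for one cache tier)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._data: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (self.clock() + self.ttl, value)


def tiered_get(key: str, tiers: list, load_from_origin: Callable[[str], Any]) -> Any:
    """Check tiers from fastest to slowest; backfill faster tiers on a hit.

    A stored value of None is treated as a miss in this sketch.
    """
    for i, tier in enumerate(tiers):
        value = tier.get(key)
        if value is not None:
            for faster in tiers[:i]:
                faster.set(key, value)
            return value
    value = load_from_origin(key)  # all tiers missed: go to the database
    for tier in tiers:
        tier.set(key, value)
    return value
```

For example, `tiered_get("feed:42", [TTLCache(60), TTLCache(300)], load_feed)` would model a 1-minute L1 in front of a 5-minute L2, where `load_feed` is whatever function reads from the database.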

ML-Based Feed Ranking

Ranking Model

Score = f(
  affinity (user-author relationship),
  weight (post type),
  time_decay (recency),
  engagement (likes, comments),
  content_quality,
  diversity
)

Model: Gradient Boosted Decision Trees
Training: Offline on historical data
Serving: Real-time feature computation
Update: Daily retraining
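
To show how the signals above might combine, here is a hand-weighted scoring function with exponential time decay. It is a simplified stand-in, not the gradient boosted model itself: the multiplicative form, the 6-hour half-life, the comment weighting, and all names are assumptions, and the diversity term is omitted because it depends on the other posts already placed in the feed.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    affinity: float      # 0..1, strength of the user-author relationship
    type_weight: float   # weight for the post type (photo, video, ...)
    age_hours: float     # time since the post was created
    likes: int
    comments: int
    quality: float       # 0..1, content quality estimate


HALF_LIFE_HOURS = 6.0  # assumed: a post loses half its recency weight every 6 hours


def time_decay(age_hours: float, half_life: float = HALF_LIFE_HOURS) -> float:
    """Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life."""
    return 0.5 ** (age_hours / half_life)


def score(c: Candidate) -> float:
    """Hand-weighted score; a trained model would learn this mapping instead."""
    # log1p dampens very large counts; comments count double (assumed weighting).
    engagement = math.log1p(c.likes + 2 * c.comments)
    return c.affinity * c.type_weight * time_decay(c.age_hours) * (1.0 + engagement) * c.quality


def rank(candidates: list) -> list:
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)
```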

Feature Engineering

  • User features: demographics, interests, behavior
  • Post features: type, length, media, engagement
  • Context features: time, device, location
  • Interaction features: past engagement with author

Handling Celebrity Users

Fan-out Strategy

Regular (<1K friends/followers): Fan-out on write
Power (1K-100K): Partial fan-out
Celebrity (>100K): Fan-out on read

Celebrity Post Handling:
1. Skip fan-out
2. Store in celebrity_posts table
3. Fetch on-demand when user requests feed
4. Cache aggressively (1 hour TTL)
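
The tiering and the celebrity path above can be sketched as follows. This is a minimal in-memory illustration: `FeedStore` and its methods are hypothetical, the text does not define partial fan-out so the power tier is written like the regular tier here, and the 1-hour cache on celebrity posts is left out.

```python
from collections import defaultdict

REGULAR_MAX = 1_000        # below this: fan-out on write
CELEBRITY_MIN = 100_000    # above this: fan-out on read


def fanout_strategy(follower_count: int) -> str:
    """Pick a strategy from the thresholds in the text."""
    if follower_count < REGULAR_MAX:
        return "write"
    if follower_count <= CELEBRITY_MIN:
        return "partial"
    return "read"


class FeedStore:
    def __init__(self):
        self.feeds = defaultdict(list)            # user_id -> [(timestamp, post_id)]
        self.celebrity_posts = defaultdict(list)  # author_id -> [(timestamp, post_id)]

    def publish(self, author_id, post_id, timestamp, followers):
        if fanout_strategy(len(followers)) == "read":
            # Steps 1-2: skip fan-out and store the post once.
            self.celebrity_posts[author_id].append((timestamp, post_id))
            return
        # Regular tier (and, in this sketch, the power tier): push to every follower.
        for follower_id in followers:
            self.feeds[follower_id].append((timestamp, post_id))

    def read_feed(self, user_id, followed_celebrities, limit=20):
        # Step 3: merge the precomputed feed with celebrity posts on demand.
        items = list(self.feeds.get(user_id, []))
        for author_id in followed_celebrities:
            items.extend(self.celebrity_posts.get(author_id, []))
        items.sort(reverse=True)  # newest first
        return [post_id for _, post_id in items[:limit]]
```

The trade-off is that a celebrity post costs one write instead of hundreds of thousands, at the price of an extra lookup per followed celebrity on every feed read, which is why the text pairs this path with aggressive caching.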

Performance Optimization

Feed Generation

  • Pre-compute feeds for active users
  • Pull model for celebrity content
  • ML ranking in real-time
  • Cache results (5 min TTL)

Media Delivery

  • CDN with 90% hit rate
  • Multiple image sizes
  • Video adaptive bitrate
  • Lazy loading

Auto-Scaling

Triggers

  • CPU > 70%: Scale out
  • Memory > 80%: Scale out
  • Request queue > 1000: Scale out
  • Response time > 1s: Scale out

Policies

  • Target tracking: Maintain CPU at 60%
  • Step scaling: Add servers based on load
  • Scheduled: Scale for peak hours
  • Predictive: ML-based prediction
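
The triggers and the target-tracking policy can be sketched as two small functions. The names and the `Metrics` structure are assumptions; the sizing rule is the common proportional formula (the same shape the Kubernetes Horizontal Pod Autoscaler uses), not something the text specifies.

```python
import math
from dataclasses import dataclass


@dataclass
class Metrics:
    cpu_pct: float
    memory_pct: float
    queue_depth: int
    response_time_s: float


def should_scale_out(m: Metrics) -> bool:
    """True if any trigger threshold from the list above is exceeded."""
    return (
        m.cpu_pct > 70
        or m.memory_pct > 80
        or m.queue_depth > 1000
        or m.response_time_s > 1.0
    )


def target_tracking_size(current_servers: int, cpu_pct: float,
                         target_cpu_pct: float = 60.0) -> int:
    """Proportional sizing: pick the fleet size that would bring CPU to target."""
    return max(1, math.ceil(current_servers * cpu_pct / target_cpu_pct))
```

For instance, 100 servers running at 90% CPU with a 60% target gives `target_tracking_size(100, 90.0) == 150`. A real policy would also apply cooldowns and minimum and maximum fleet sizes to avoid oscillation.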

Monitoring

Key Metrics

  • Feed load time (p95 <1s)
  • Post creation latency (p95 <500ms)
  • Reaction latency (p95 <100ms)
  • Cache hit rate (>95%)
  • Error rate (<0.1%)
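
The p95 targets above are percentile checks over latency samples. A minimal sketch using the nearest-rank method follows; the function names are hypothetical, and production systems typically compute percentiles from histograms or sketches rather than by sorting raw samples.

```python
def percentile(samples: list, pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_p95_target(samples: list, target_seconds: float) -> bool:
    """True if the p95 latency is strictly below the target."""
    return percentile(samples, 95) < target_seconds
```

With feed load times collected over a window, `meets_p95_target(load_times, 1.0)` checks the first metric in the list; 0.5 and 0.1 would be used for post creation and reactions.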

Alerting

  • Critical: Error rate >5%, page immediately
  • Warning: Error rate >1%, notify team
  • Info: Performance degradation (for example, a p95 latency target is missed), log for review
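
The alert levels can be expressed as a small classifier. This is a sketch under assumptions: the function name is hypothetical, and treating a missed 1-second feed p95 as the "performance degradation" signal is an assumption, since the text gives no threshold for the info level.

```python
def alert_severity(error_rate: float, feed_p95_seconds: float = 0.0) -> str:
    """Classify the current state; error_rate is a fraction (0.05 means 5%)."""
    if error_rate > 0.05:
        return "critical"  # page immediately
    if error_rate > 0.01:
        return "warning"   # notify the team
    if feed_p95_seconds > 1.0:
        return "info"      # assumed degradation signal: feed p95 over 1 s
    return "ok"
```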

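Taken together, horizontal scaling, tiered caching, tier-specific fan-out, and load-driven auto-scaling are what allow a newsfeed of this kind to serve billions of users while still ranking each feed with an ML model at request time.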