Uber Backend - Scale and Constraints
Traffic and Load Estimates
User Base Statistics
- Total Registered Users: 150 million riders globally
- Monthly Active Users: 130 million MAU (87% of registered)
- Daily Active Users: 25 million DAU (19% of MAU)
- Active Drivers: 6 million drivers globally
- Daily Active Drivers: 3 million drivers online daily
- Geographic Distribution: 10,000+ cities across 70+ countries
Trip Volume Analysis
- Daily Trips: 50 million trips per day globally
- Peak Hour Trips: 12.5 million trips during the evening peak (6-9 PM), 25% of daily volume
- Average Trip Duration: 20 minutes per trip
- Concurrent Active Rides: 10 million rides at peak
- Trip Requests per Second: 5,000 requests/second average, 15,000 peak
- Annual Trip Volume: 18 billion trips per year
Request Patterns
Time Distribution (24-hour cycle):
00:00-06:00: 5% of daily trips (2.5M trips, 500 req/s)
06:00-09:00: 20% of daily trips (10M trips, 3,000 req/s)
09:00-12:00: 15% of daily trips (7.5M trips, 2,000 req/s)
12:00-14:00: 12% of daily trips (6M trips, 1,500 req/s)
14:00-18:00: 18% of daily trips (9M trips, 2,500 req/s)
18:00-21:00: 25% of daily trips (12.5M trips, 4,000 req/s)
21:00-24:00: 5% of daily trips (2.5M trips, 500 req/s)
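As a cross-check on the table above, the sketch below converts each window's share of daily trips into an average rate of completed trips per second. The names are illustrative. Note that the req/s column in the table is higher than these rates; one reading (an assumption, not stated in the text) is that req/s counts all request traffic, such as fare estimates, retries, and cancellations, rather than only completed trips.

```python
# Share of daily trips per time window, taken from the table above.
DAILY_TRIPS = 50_000_000
WINDOWS = [
    # (start_hour, end_hour, share_of_daily_trips)
    (0, 6, 0.05),
    (6, 9, 0.20),
    (9, 12, 0.15),
    (12, 14, 0.12),
    (14, 18, 0.18),
    (18, 21, 0.25),
    (21, 24, 0.05),
]


def window_rates(daily_trips, windows):
    """Return (start, end, trips, trips_per_second) for each window."""
    rows = []
    for start, end, share in windows:
        trips = daily_trips * share
        seconds = (end - start) * 3600
        rows.append((start, end, trips, trips / seconds))
    return rows


rates = window_rates(DAILY_TRIPS, WINDOWS)
for start, end, trips, per_sec in rates:
    print(f"{start:02d}:00-{end:02d}:00  {trips / 1e6:5.1f}M trips  {per_sec:7.0f} trips/s")
```

The evening window averages roughly 1,160 completed trips per second under these inputs.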
Geographic Distribution:
North America: 35% (17.5M trips/day)
Europe: 25% (12.5M trips/day)
Asia-Pacific: 30% (15M trips/day)
Latin America: 8% (4M trips/day)
Other: 2% (1M trips/day)
GPS and Location Data Scale
Location Update Volume
- Active Drivers Sending GPS: 3 million drivers during peak
- Update Frequency: Every 4 seconds per driver
- GPS Updates per Second: 750,000 updates/second
- Daily GPS Data Points: ~65 billion location updates per day (750K/s × 86,400 s; an upper bound that assumes the peak rate all day)
- GPS Data Size: 100 bytes per update
- Daily GPS Storage: 6.5 TB of raw GPS data per day
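The figures in this list follow from three inputs: driver count, update interval, and payload size. A minimal sketch of the arithmetic, using decimal units (1 TB = 10^12 bytes) and assuming the peak driver count holds for the full day:

```python
ACTIVE_DRIVERS = 3_000_000   # drivers sending GPS at peak
UPDATE_INTERVAL_S = 4        # one update every 4 seconds
BYTES_PER_UPDATE = 100
SECONDS_PER_DAY = 86_400

updates_per_second = ACTIVE_DRIVERS / UPDATE_INTERVAL_S
# Assumes the peak rate is sustained for 24 hours, so this is an upper bound.
updates_per_day = updates_per_second * SECONDS_PER_DAY
raw_bytes_per_day = updates_per_day * BYTES_PER_UPDATE

print(f"{updates_per_second:,.0f} updates/s")
print(f"{updates_per_day / 1e9:.1f} billion updates/day")
print(f"{raw_bytes_per_day / 1e12:.2f} TB/day")
```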
Location Processing Requirements
- Real-time Processing: Process 750K GPS updates/second
- ETA Calculations: 15,000 ETA calculations/second during peak
- Route Calculations: 10,000 route calculations/second
- Geospatial Queries: 50,000 nearby driver queries/second
- Map Tile Requests: 100,000 map tile requests/second
- Geocoding Requests: 20,000 address lookups/second
Geospatial Index Size
- Active Driver Locations: 3 million points in real-time index
- Historical Trip Routes: 50 million trips × 300 points = 15B points
- City Boundaries: 10,000 cities with polygon definitions
- Service Areas: 50,000 geofenced zones globally
- Heat Map Cells: 100 million grid cells for demand/supply tracking
- Total Geospatial Data: 50 TB of geospatial indices
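Nearby-driver lookups against a real-time index of a few million points are commonly served by bucketing positions into cells. The sketch below uses a uniform latitude/longitude grid purely for illustration; the text does not say which index structure is used. `GridIndex`, `CELL_DEG`, and the sample coordinates are hypothetical. The sketch returns candidates by cell only, with no distance filtering and no handling of the antimeridian or poles.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical cell size, roughly 1.1 km of latitude


class GridIndex:
    """Toy in-memory index: driver_id -> cell, cell -> set of driver_ids."""

    def __init__(self, cell_deg=CELL_DEG):
        self.cell_deg = cell_deg
        self.cells = defaultdict(set)
        self.driver_cell = {}

    def _cell(self, lat, lon):
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def update(self, driver_id, lat, lon):
        """Move a driver to the cell containing its latest position."""
        new_cell = self._cell(lat, lon)
        old_cell = self.driver_cell.get(driver_id)
        if old_cell == new_cell:
            return
        if old_cell is not None:
            self.cells[old_cell].discard(driver_id)
        self.cells[new_cell].add(driver_id)
        self.driver_cell[driver_id] = new_cell

    def nearby(self, lat, lon, rings=1):
        """Return drivers in the query cell plus `rings` rings of neighbouring cells."""
        row, col = self._cell(lat, lon)
        found = set()
        for dr in range(-rings, rings + 1):
            for dc in range(-rings, rings + 1):
                found |= self.cells.get((row + dr, col + dc), set())
        return found


index = GridIndex()
index.update("d1", 37.7749, -122.4194)
index.update("d2", 37.7755, -122.4180)
index.update("d3", 37.9000, -122.3000)  # well outside the search area
print(sorted(index.nearby(37.7750, -122.4190)))
```

Each location update touches at most two cells, which keeps the write path cheap at high update rates; the trade-off is that cell size has to be tuned to driver density.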
Matching and Dispatch Scale
Matching System Load
- Ride Requests per Second: 5,000 average, 15,000 peak
- Drivers Searched per Request: Average 50 drivers within search radius
- Total Driver Evaluations: 250,000 evaluations/second average
- Matching Decisions: 5,000 matches/second
- Failed Match Retries: 1,000 retries/second (20% initial failure rate)
- Scheduled Rides: 5 million scheduled rides in queue
Driver State Management
- Driver Status Updates: 100,000 status changes/second
- Driver Availability Checks: 500,000 checks/second
- Driver Location Updates: 750,000 updates/second
- Driver Acceptance/Rejection: 10,000 responses/second
- Driver Routing Updates: 50,000 route recalculations/second
- Driver Earnings Updates: 5,000 trip completions/second
Matching Algorithm Complexity
Per Ride Request:
1. Geospatial query: O(log n) for nearby drivers
2. Filter available drivers: O(k) where k = nearby drivers
3. Score and rank drivers: O(k log k)
4. Send requests: O(1) per driver
5. Wait for acceptance: Timeout-based
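Steps 2 and 3 of the list above can be sketched as a filter followed by a sort. The `Driver` fields and the scoring formula are hypothetical, chosen only to show the shape of the computation; the text does not specify how drivers are scored.

```python
from dataclasses import dataclass


@dataclass
class Driver:
    driver_id: str
    distance_km: float  # distance to pickup, from the geospatial query
    rating: float       # 1.0 to 5.0
    available: bool


def score(driver):
    """Hypothetical score: closer and better-rated drivers rank higher."""
    return driver.rating - 0.5 * driver.distance_km


def rank_candidates(nearby, max_offers=3):
    """Filter available drivers (O(k)), then sort by score (O(k log k))."""
    available = [d for d in nearby if d.available]
    ranked = sorted(available, key=score, reverse=True)
    return ranked[:max_offers]


nearby = [
    Driver("a", 1.2, 4.9, True),
    Driver("b", 0.4, 4.6, True),
    Driver("c", 0.2, 4.8, False),  # closest, but not available
    Driver("d", 3.0, 5.0, True),
]
offers = rank_candidates(nearby)
print([d.driver_id for d in offers])
```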
Total Complexity: O(log n + k log k) per request, dominated by the k log k ranking step
With k=50 average: ~300 operations per request (50 × log2 50 ≈ 282)
At 15,000 req/s peak: 4.5M operations/second
Payment Processing Scale
Transaction Volume
- Daily Transactions: 50 million payment transactions
- Transactions per Second: 580 TPS average, 1,800 TPS peak
- Payment Methods: 60% cards, 25% digital wallets, 10% cash, 5% corporate
- Average Transaction Value: $15 per trip
- Daily Payment Volume: $750 million processed daily
- Annual GMV: $270 billion gross merchandise value
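The throughput and volume figures above can be reproduced from the daily transaction count and the average ticket. A short sketch of that arithmetic follows; the 1,800 TPS peak is taken from the text rather than derived.

```python
DAILY_TRANSACTIONS = 50_000_000
AVG_VALUE_USD = 15
PEAK_TPS = 1_800          # stated in the text, not derived here
SECONDS_PER_DAY = 86_400

avg_tps = DAILY_TRANSACTIONS / SECONDS_PER_DAY
daily_volume = DAILY_TRANSACTIONS * AVG_VALUE_USD
annual_gmv = daily_volume * 365
peak_to_avg = PEAK_TPS / avg_tps

print(f"average: {avg_tps:.0f} TPS, peak/average ratio: {peak_to_avg:.1f}x")
print(f"daily volume: ${daily_volume / 1e6:,.0f}M")
print(f"annual GMV: ${annual_gmv / 1e9:,.1f}B")
```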
Payment Processing Requirements
- Authorization Requests: 1,800 requests/second peak
- Capture Requests: 1,800 captures/second peak
- Refund Requests: 100 refunds/second
- Payout to Drivers: 3 million payouts per day
- Split Payments: 5 million split payment calculations/day
- Fraud Checks: 1,800 fraud evaluations/second
Financial Data Storage
- Transaction Records: 50M transactions/day × 1KB = 50 GB/day
- Payment Methods: 150M users × 2 methods × 500 bytes = 150 GB
- Driver Earnings: 6M drivers × 365 days × 2KB = 4.4 TB/year
- Invoices and Receipts: 50M receipts/day × 10KB = 500 GB/day
- Audit Logs: 100M events/day × 500 bytes = 50 GB/day
- Total Financial Data: 600 GB/day, 220 TB/year
Data Storage Requirements
Database Storage Breakdown
User Data:
- Rider Profiles: 150M × 5KB = 750 GB
- Driver Profiles: 6M × 20KB = 120 GB
- Payment Methods: 150M × 1KB = 150 GB
- Preferences: 150M × 2KB = 300 GB
Total User Data: ~1.3 TB
Trip Data:
- Active Trips: 10M × 10KB = 100 GB
- Daily Trip Records: 50M × 5KB = 250 GB/day
- Trip History (2 years): 36.5B × 5KB = 183 TB
- GPS Trails: 50M trips × 300 points × 100 bytes = 1.5 TB/day
Total Trip Data: 200+ TB
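The trip-data figures are products of record counts and record sizes. A minimal sketch, using decimal units (1 KB = 1,000 bytes) to match the round numbers in the text:

```python
KB = 1_000
DAILY_TRIPS = 50_000_000
TRIP_RECORD_BYTES = 5 * KB
RETENTION_DAYS = 2 * 365
POINTS_PER_TRIP = 300
BYTES_PER_POINT = 100

daily_records_bytes = DAILY_TRIPS * TRIP_RECORD_BYTES
history_bytes = daily_records_bytes * RETENTION_DAYS
gps_trails_bytes_per_day = DAILY_TRIPS * POINTS_PER_TRIP * BYTES_PER_POINT

print(f"daily trip records: {daily_records_bytes / 1e9:.0f} GB/day")
print(f"2-year trip history: {history_bytes / 1e12:.1f} TB")
print(f"GPS trails: {gps_trails_bytes_per_day / 1e12:.1f} TB/day")
```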
Location Data:
- Real-time Driver Locations: 3M × 200 bytes = 600 MB
- Historical GPS Data (90 days): 6.5TB/day × 90 = 585 TB
- Heat Maps: 100M cells × 1KB = 100 GB
Total Location Data: ~586 TB
Payment Data:
- Transaction History: 50 GB/day × 365 = 18 TB/year
- Payment Methods: 150 GB
- Invoices: 500 GB/day × 365 = 183 TB/year
Total Payment Data: ~200 TB/year
Analytics and Logs:
- Application Logs: 10 TB/day
- Analytics Events: 5 TB/day
- Audit Logs: 50 GB/day
Total Logs: 15 TB/day, 5.5 PB/year
Grand Total Storage: 1.5 PB active data, 6 PB total with history
Cache Requirements
- Driver Location Cache: 3M drivers × 500 bytes = 1.5 GB
- Rider Session Cache: 10M active sessions × 10KB = 100 GB
- Surge Pricing Cache: 100K zones × 1KB = 100 MB
- ETA Cache: 50M cached ETAs × 500 bytes = 25 GB
- User Profile Cache: 25M hot profiles × 5KB = 125 GB
- Total Cache Memory: 250 GB across distributed cache cluster
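The cache total is the sum of entry count times entry size for each cache. A small sketch that makes the per-cache contribution explicit (the dictionary keys are illustrative names):

```python
# (entry_count, bytes_per_entry) for each cache listed above
CACHES = {
    "driver_location": (3_000_000, 500),
    "rider_session": (10_000_000, 10_000),
    "surge_pricing": (100_000, 1_000),
    "eta": (50_000_000, 500),
    "user_profile": (25_000_000, 5_000),
}

sizes_gb = {name: count * size / 1e9 for name, (count, size) in CACHES.items()}
total_gb = sum(sizes_gb.values())

for name, gb in sorted(sizes_gb.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:16s} {gb:7.1f} GB")
print(f"{'total':16s} {total_gb:7.1f} GB")
```

Sessions and profiles account for nearly all of the memory; the driver-location and surge caches are small by comparison.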
Network Bandwidth Requirements
Data Transfer Volume
Inbound Traffic:
- GPS Updates: 750K/s × 100 bytes = 75 MB/s = 6.5 TB/day
- API Requests: 50K req/s × 2KB = 100 MB/s = 8.6 TB/day
- Image Uploads: 10K uploads/s × 500KB = 5 GB/s = 432 TB/day
Total Inbound: 5.2 GB/s peak, 450 TB/day
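Each inbound line above is an event rate multiplied by a payload size, then scaled to a day. The sketch below reproduces that conversion with decimal units; the same helper applies to the outbound streams. The function and dictionary names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def bandwidth(events_per_second, bytes_per_event):
    """Return (bytes per second, bytes per day) for a steady stream."""
    per_second = events_per_second * bytes_per_event
    return per_second, per_second * SECONDS_PER_DAY


# (events per second, bytes per event)
INBOUND = {
    "gps_updates": (750_000, 100),
    "api_requests": (50_000, 2_000),
    "image_uploads": (10_000, 500_000),
}

total_bps = 0
for name, (rate, size) in INBOUND.items():
    bps, bpd = bandwidth(rate, size)
    total_bps += bps
    print(f"{name:14s} {bps / 1e6:8.0f} MB/s  {bpd / 1e12:7.1f} TB/day")

total_bpd = total_bps * SECONDS_PER_DAY
print(f"{'total':14s} {total_bps / 1e9:8.2f} GB/s  {total_bpd / 1e12:7.1f} TB/day")
```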
Outbound Traffic:
- Map Tiles: 100K req/s × 50KB = 5 GB/s = 432 TB/day
- API Responses: 50K resp/s × 5KB = 250 MB/s = 21.6 TB/day
- Push Notifications: 100K notif/s × 1KB = 100 MB/s = 8.6 TB/day
- Real-time Updates: 1M updates/s × 500 bytes = 500 MB/s = 43 TB/day
Total Outbound: 5.8 GB/s peak, 505 TB/day
Total Bandwidth: 11 GB/s peak, 955 TB/day
CDN and Edge Requirements
- Static Assets: 500 GB of app assets, images, maps
- CDN Cache Hit Rate: 95% for static content
- Edge Locations: 200+ PoPs globally
- CDN Bandwidth: 5 GB/s for cached content
- Origin Bandwidth: 250 MB/s for cache misses
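Origin bandwidth is the share of edge traffic that misses the cache. A quick sketch of that relationship, which also shows how sensitive origin load is to the hit rate (the 90% case is a hypothetical comparison, not a figure from the text):

```python
def origin_bandwidth(edge_bytes_per_second, hit_rate):
    """Bytes per second that fall through to the origin on cache misses."""
    return edge_bytes_per_second * (1 - hit_rate)


EDGE_BPS = 5e9  # 5 GB/s served at the edge

at_95 = origin_bandwidth(EDGE_BPS, 0.95)
at_90 = origin_bandwidth(EDGE_BPS, 0.90)  # hypothetical degraded hit rate

print(f"95% hit rate: {at_95 / 1e6:.0f} MB/s to origin")
print(f"90% hit rate: {at_90 / 1e6:.0f} MB/s to origin")
```

A five-point drop in hit rate doubles origin traffic, so origin capacity is usually sized with headroom above the steady-state miss volume.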
Compute Requirements
Application Server Capacity
- API Servers: 5,000 instances (16 vCPU, 32GB RAM each)
- Matching Service: 2,000 instances (32 vCPU, 64GB RAM each)
- Location Service: 3,000 instances (16 vCPU, 32GB RAM each)
- Payment Service: 1,000 instances (8 vCPU, 16GB RAM each)
- Notification Service: 500 instances (8 vCPU, 16GB RAM each)
- Total Compute: 200,000 vCPUs, 400 TB RAM
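The totals above are rounded sums over the per-service fleets. A short sketch of the tally, using decimal units (1 TB = 1,000 GB):

```python
# service: (instances, vCPUs per instance, RAM GB per instance)
FLEET = {
    "api": (5_000, 16, 32),
    "matching": (2_000, 32, 64),
    "location": (3_000, 16, 32),
    "payment": (1_000, 8, 16),
    "notification": (500, 8, 16),
}

total_vcpus = sum(n * cpu for n, cpu, _ in FLEET.values())
total_ram_tb = sum(n * ram for n, _, ram in FLEET.values()) / 1_000

print(f"{total_vcpus:,} vCPUs, {total_ram_tb:.0f} TB RAM")
```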
Database Capacity
- Primary Databases: 500 shards (64 vCPU, 256GB RAM each)
- Read Replicas: 1,500 replicas (32 vCPU, 128GB RAM each)
- Cache Clusters: 200 nodes (16 vCPU, 128GB RAM each)
- Analytics Databases: 100 nodes (64 vCPU, 512GB RAM each)
- Total Database Compute: ~90,000 vCPUs, ~400 TB RAM
Message Queue Capacity
- Kafka Brokers: 500 brokers (16 vCPU, 64GB RAM each)
- Message Throughput: 10 million messages/second
- Message Retention: 7 days of message history
- Storage per Broker: 10 TB NVMe SSD
- Total Queue Storage: 5 PB
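One way to sanity-check the queue sizing is to ask how many bytes per message the cluster can afford to retain. The sketch below derives that budget from the figures above. The replication factor of 3 is an assumption (the text does not state one), and the calculation treats 10 million messages/second as a sustained rate.

```python
BROKERS = 500
STORAGE_PER_BROKER_TB = 10
MESSAGES_PER_SECOND = 10_000_000
RETENTION_DAYS = 7
REPLICATION_FACTOR = 3  # assumption: not stated in the text

total_storage_bytes = BROKERS * STORAGE_PER_BROKER_TB * 1e12
retained_messages = MESSAGES_PER_SECOND * RETENTION_DAYS * 86_400

# Average on-disk bytes available per retained message.
budget_unreplicated = total_storage_bytes / retained_messages
budget_replicated = budget_unreplicated / REPLICATION_FACTOR

print(f"total storage: {total_storage_bytes / 1e15:.0f} PB")
print(f"budget per message: {budget_unreplicated:.0f} bytes unreplicated, "
      f"{budget_replicated:.0f} bytes at RF={REPLICATION_FACTOR}")
```

Under these assumptions the budget is a few hundred bytes per message, which fits small payloads such as 100-byte GPS updates but leaves little room for larger events at the full retention window.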
Cost Estimates
Infrastructure Costs (Monthly)
Compute:
- Application Servers: $500K/month
- Database Servers: $800K/month
- Cache Clusters: $100K/month
- Message Queues: $200K/month
Total Compute: $1.6M/month
Storage:
- Database Storage: $300K/month
- Object Storage: $200K/month
- Backup Storage: $100K/month
Total Storage: $600K/month
Network:
- Data Transfer: $400K/month
- CDN: $300K/month
- Load Balancers: $50K/month
Total Network: $750K/month
Third-Party Services:
- Maps API: $500K/month
- Payment Processing: $200K/month (+ 2.5% transaction fees)
- SMS/Push Notifications: $100K/month
- Monitoring/Logging: $50K/month
Total Third-Party: $850K/month
Grand Total: $3.8M/month infrastructure
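The infrastructure grand total is the sum of the four category subtotals. A small sketch of the roll-up, with illustrative key names:

```python
MONTHLY_COSTS_USD = {
    "compute": {"app_servers": 500_000, "db_servers": 800_000,
                "cache": 100_000, "queues": 200_000},
    "storage": {"database": 300_000, "object": 200_000, "backup": 100_000},
    "network": {"transfer": 400_000, "cdn": 300_000, "load_balancers": 50_000},
    "third_party": {"maps": 500_000, "payments": 200_000,
                    "notifications": 100_000, "monitoring": 50_000},
}

subtotals = {category: sum(items.values())
             for category, items in MONTHLY_COSTS_USD.items()}
grand_total = sum(subtotals.values())

for category, amount in subtotals.items():
    print(f"{category:12s} ${amount / 1e6:.2f}M ({amount / grand_total:.0%})")
print(f"{'total':12s} ${grand_total / 1e6:.2f}M")
```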
Transaction Fees: $562M/month (2.5% of $22.5B monthly GMV, i.e. $750M/day × 30)
Total Operating Cost: ~$566M/month, ~$6.8B/year
Cost per Trip
- Infrastructure Cost: $0.0025 per trip ($3.8M/month / 1.5B monthly trips)
- Transaction Fees: $0.375 per trip (2.5% of $15 average)
- Total Cost per Trip: ~$0.38
- Revenue per Trip: $3.75 (25% take rate on $15 trip)
- Gross Margin: ~$3.37 per trip (90% margin)
Scaling Bottlenecks
Critical Bottlenecks
- Geospatial Queries: 50K nearby-driver queries/second, fanning out to 250K driver evaluations/second
- GPS Data Ingestion: 750K location updates/second
- Real-time Matching: 15K matching decisions/second at peak
- Payment Processing: 1,800 transactions/second
- Database Writes: 100K writes/second across all shards
- Cache Invalidation: Coordinating cache updates across regions
Mitigation Strategies
- Geospatial: Distributed geospatial indices with regional sharding
- GPS Ingestion: Kafka-based buffering with batch processing
- Matching: Parallel matching with pre-filtered driver pools
- Payments: Async payment processing with eventual consistency
- Database: Aggressive sharding and read replica scaling
- Cache: Regional cache clusters with lazy invalidation
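The buffering-and-batching idea behind the GPS ingestion strategy can be illustrated without a message broker. The sketch below is a plain in-process micro-batcher, not Kafka client code: `MicroBatcher`, its parameters, and the record shape are hypothetical. It flushes when a batch reaches a size limit or when the oldest buffered record exceeds an age limit; the age check runs only when a new record arrives.

```python
import time


class MicroBatcher:
    """Collect records and hand them to `flush` in batches by size or age."""

    def __init__(self, flush, max_size=500, max_age_s=1.0, clock=time.monotonic):
        self.flush = flush
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.clock = clock
        self.buffer = []
        self.first_at = None

    def add(self, record):
        if not self.buffer:
            self.first_at = self.clock()
        self.buffer.append(record)
        too_big = len(self.buffer) >= self.max_size
        too_old = self.clock() - self.first_at >= self.max_age_s
        if too_big or too_old:
            self.drain()

    def drain(self):
        """Flush whatever is buffered; call this on shutdown as well."""
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.flush(batch)


batches = []
batcher = MicroBatcher(flush=batches.append, max_size=3)
for i in range(7):
    batcher.add({"driver_id": i, "lat": 0.0, "lon": 0.0})
batcher.drain()
print([len(b) for b in batches])
```

Batching trades a bounded amount of latency (at most `max_age_s` between arrivals) for far fewer downstream writes, which is the usual motivation for this pattern at high update rates.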
Taken together, these estimates describe a global ride-sharing platform that handles about 50 million trips and 65 billion GPS updates per day, with GPS ingestion, geospatial lookup, and matching as the dominant real-time workloads, all while targeting sub-second response times.