Scale & Constraints

📖 7 min read 📄 Part 2 of 10

Scale Constraints for Shopify Platform

Estimated reading time: 20 minutes

Overview

Shopify-scale platforms must handle massive multi-tenant workloads with varying traffic patterns, from small businesses with occasional sales to enterprise merchants processing thousands of orders per minute during flash sales.

Traffic and User Scale

1. Merchant Scale

Active Merchants:

  • 1M+ active stores globally
  • 100K+ new stores created monthly
  • 10K+ enterprise-level merchants
  • 500K+ merchants with custom domains

Geographic Distribution:

  • 175+ countries supported
  • 50+ languages and currencies
  • 24/7 global traffic patterns
  • Regional compliance requirements

Merchant Tiers:

class MerchantTiers:
    BASIC = {
        'max_products': 1000,
        'max_orders_per_month': 1000,
        'storage_limit': '1GB',
        'bandwidth_limit': '10GB'
    }
    
    PROFESSIONAL = {
        'max_products': 10000,
        'max_orders_per_month': 10000,
        'storage_limit': '10GB',
        'bandwidth_limit': '100GB'
    }
    
    ENTERPRISE = {
        'max_products': 'unlimited',
        'max_orders_per_month': 'unlimited',
        'storage_limit': '1TB',
        'bandwidth_limit': '10TB'
    }

2. Customer Traffic Scale

Concurrent Users:

  • 10M+ concurrent shoppers during peak events
  • 1M+ concurrent admin users (merchants)
  • 100K+ concurrent API requests
  • 50K+ concurrent checkout processes

Traffic Patterns:

class TrafficPatterns:
    def get_peak_multipliers(self):
        return {
            'black_friday': 20,      # 20x normal traffic
            'cyber_monday': 15,      # 15x normal traffic
            'holiday_season': 5,     # 5x normal traffic
            'flash_sales': 10,       # 10x normal traffic
            'product_launches': 8    # 8x normal traffic
        }
    
    def get_regional_peaks(self):
        return {
            'us_east': '14:00-16:00 EST',    # Lunch hour shopping
            'us_west': '19:00-21:00 PST',    # Evening shopping
            'europe': '20:00-22:00 CET',     # Evening shopping
            'asia_pacific': '12:00-14:00 JST' # Lunch hour shopping
        }

Data Scale Requirements

1. Product Catalog Scale

Product Data Volume:

  • 100M+ products across all stores
  • 1B+ product variants (size, color, etc.)
  • 10B+ product images and media files
  • 1TB+ of product data daily

Product Catalog Growth:

-- Daily product creation estimates
SELECT 
    'Small merchants' as segment,
    50000 as new_products_daily,
    '10MB' as avg_product_size
UNION ALL
SELECT 
    'Medium merchants',
    20000,
    '25MB'
UNION ALL
SELECT 
    'Enterprise merchants',
    5000,
    '100MB';

2. Transaction Scale

Order Volume:

  • 50M+ orders processed monthly
  • 2M+ orders during peak days
  • 100K+ orders per hour during flash sales
  • $100B+ gross merchandise value (GMV) annually

Transaction Data Growth:

class TransactionScale:
    def calculate_daily_data_growth(self):
        return {
            'orders': {
                'count': 1_600_000,        # Daily orders
                'avg_size_kb': 5,          # Per order record
                'daily_growth_gb': 8       # Total daily growth
            },
            'order_items': {
                'count': 4_800_000,        # 3 items per order avg
                'avg_size_kb': 2,
                'daily_growth_gb': 9.6
            },
            'payments': {
                'count': 1_600_000,
                'avg_size_kb': 3,
                'daily_growth_gb': 4.8
            }
        }

3. Media and Asset Scale

File Storage Requirements:

  • 100TB+ of product images
  • 50TB+ of theme assets and templates
  • 20TB+ of merchant uploaded content
  • 10TB+ of system generated reports

CDN Distribution:

class CDNScale:
    def get_cdn_requirements(self):
        return {
            'global_pops': 200,           # Points of presence
            'cache_hit_ratio': 0.95,      # 95% cache hit rate
            'bandwidth_peak': '10Tbps',   # Peak bandwidth
            'storage_capacity': '1PB',    # Total cached content
            'purge_requests': 1_000_000   # Daily cache purges
        }

Performance Scale Targets

1. Response Time Requirements

Storefront Performance:

class StorefrontSLA:
    def get_performance_targets(self):
        return {
            'homepage_load': {
                'p50': 800,    # 800ms
                'p95': 1500,   # 1.5s
                'p99': 2500    # 2.5s
            },
            'product_page_load': {
                'p50': 1000,   # 1s
                'p95': 2000,   # 2s
                'p99': 3000    # 3s
            },
            'checkout_process': {
                'p50': 1200,   # 1.2s
                'p95': 2500,   # 2.5s
                'p99': 4000    # 4s
            },
            'search_results': {
                'p50': 300,    # 300ms
                'p95': 800,    # 800ms
                'p99': 1500    # 1.5s
            }
        }

Admin Dashboard Performance:

class AdminSLA:
    def get_performance_targets(self):
        return {
            'dashboard_load': {
                'p50': 500,    # 500ms
                'p95': 1000,   # 1s
                'p99': 2000    # 2s
            },
            'product_management': {
                'p50': 600,    # 600ms
                'p95': 1200,   # 1.2s
                'p99': 2500    # 2.5s
            },
            'order_processing': {
                'p50': 400,    # 400ms
                'p95': 800,    # 800ms
                'p99': 1500    # 1.5s
            },
            'analytics_reports': {
                'p50': 2000,   # 2s
                'p95': 5000,   # 5s
                'p99': 10000   # 10s
            }
        }

2. Throughput Requirements

API Throughput:

class APIThroughput:
    def get_throughput_targets(self):
        return {
            'storefront_api': {
                'rps_normal': 100_000,     # Requests per second
                'rps_peak': 500_000,       # Peak traffic
                'concurrent_connections': 1_000_000
            },
            'admin_api': {
                'rps_normal': 50_000,
                'rps_peak': 200_000,
                'concurrent_connections': 500_000
            },
            'webhook_delivery': {
                'rps_normal': 10_000,
                'rps_peak': 50_000,
                'retry_capacity': 100_000
            }
        }

Database Scale Constraints

1. Relational Database Scale

Primary Database Clusters:

class DatabaseScale:
    def get_database_requirements(self):
        return {
            'primary_cluster': {
                'read_replicas': 20,       # Per region
                'write_capacity': '100K TPS',
                'read_capacity': '1M QPS',
                'storage_size': '100TB',
                'connection_pool': 10_000
            },
            'analytics_cluster': {
                'nodes': 50,
                'storage_size': '500TB',
                'query_capacity': '10K concurrent',
                'data_retention': '7 years'
            },
            'search_cluster': {
                'elasticsearch_nodes': 100,
                'index_size': '50TB',
                'search_qps': '100K',
                'indexing_rate': '1M docs/min'
            }
        }

2. Sharding Strategy

Tenant-Based Sharding:

-- Sharding by merchant_id for data isolation
CREATE TABLE orders_shard_1 (
    order_id UUID PRIMARY KEY,
    merchant_id UUID,
    customer_id UUID,
    total_amount DECIMAL(15,2),
    created_at TIMESTAMP
) PARTITION BY HASH (merchant_id);

-- Geographic sharding for compliance
CREATE TABLE customers_us (
    customer_id UUID PRIMARY KEY,
    merchant_id UUID,
    email VARCHAR(255),
    region VARCHAR(10) DEFAULT 'US'
);

CREATE TABLE customers_eu (
    customer_id UUID PRIMARY KEY,
    merchant_id UUID,
    email VARCHAR(255),
    region VARCHAR(10) DEFAULT 'EU'
);

Infrastructure Scale Requirements

1. Compute Resources

Application Servers:

class ComputeScale:
    def get_compute_requirements(self):
        return {
            'web_servers': {
                'instances_normal': 1000,
                'instances_peak': 5000,
                'cpu_cores_per_instance': 8,
                'memory_per_instance': '32GB',
                'auto_scaling_target': '70% CPU'
            },
            'api_servers': {
                'instances_normal': 500,
                'instances_peak': 2000,
                'cpu_cores_per_instance': 16,
                'memory_per_instance': '64GB',
                'auto_scaling_target': '60% CPU'
            },
            'background_workers': {
                'instances_normal': 200,
                'instances_peak': 1000,
                'cpu_cores_per_instance': 4,
                'memory_per_instance': '16GB',
                'queue_processing_capacity': '100K jobs/min'
            }
        }

2. Storage Infrastructure

Distributed Storage:

class StorageScale:
    def get_storage_requirements(self):
        return {
            'object_storage': {
                'capacity': '10PB',        # Petabytes
                'iops': '10M',             # I/O operations per second
                'throughput': '100GB/s',   # Data transfer rate
                'durability': '99.999999999%'  # 11 nines
            },
            'block_storage': {
                'capacity': '1PB',
                'iops': '1M',
                'throughput': '50GB/s',
                'replication_factor': 3
            },
            'cache_storage': {
                'redis_clusters': 50,
                'memory_per_cluster': '1TB',
                'total_cache_capacity': '50TB',
                'hit_ratio_target': '95%'
            }
        }

Network and Bandwidth Scale

1. Global Network Requirements

Bandwidth Allocation:

class NetworkScale:
    def get_bandwidth_requirements(self):
        return {
            'total_bandwidth': '1Tbps',    # Peak capacity
            'regional_distribution': {
                'north_america': '400Gbps',
                'europe': '300Gbps',
                'asia_pacific': '200Gbps',
                'other_regions': '100Gbps'
            },
            'cdn_bandwidth': '800Gbps',    # CDN offload
            'api_bandwidth': '200Gbps',    # API traffic
            'admin_bandwidth': '50Gbps'    # Admin interfaces
        }

2. Connection Scaling

Connection Management:

class ConnectionScale:
    def get_connection_limits(self):
        return {
            'load_balancer_connections': 10_000_000,
            'database_connections': 100_000,
            'cache_connections': 500_000,
            'websocket_connections': 1_000_000,
            'webhook_connections': 100_000
        }

Availability and Reliability Scale

1. Uptime Requirements

Service Level Objectives:

class AvailabilityTargets:
    def get_slo_targets(self):
        return {
            'storefront_availability': {
                'target': '99.99%',        # 4.32 minutes downtime/month
                'error_budget': '0.01%',
                'measurement_window': '30 days'
            },
            'admin_availability': {
                'target': '99.9%',         # 43.2 minutes downtime/month
                'error_budget': '0.1%',
                'measurement_window': '30 days'
            },
            'api_availability': {
                'target': '99.95%',        # 21.6 minutes downtime/month
                'error_budget': '0.05%',
                'measurement_window': '30 days'
            }
        }

2. Disaster Recovery Scale

Recovery Requirements:

class DisasterRecovery:
    def get_recovery_targets(self):
        return {
            'rto': {                       # Recovery Time Objective
                'critical_services': '15 minutes',
                'standard_services': '1 hour',
                'non_critical_services': '4 hours'
            },
            'rpo': {                       # Recovery Point Objective
                'transactional_data': '1 minute',
                'user_data': '5 minutes',
                'analytics_data': '1 hour'
            },
            'backup_retention': {
                'daily_backups': '30 days',
                'weekly_backups': '12 weeks',
                'monthly_backups': '7 years'
            }
        }

Security Scale Requirements

1. Authentication and Authorization

Identity Management Scale:

class SecurityScale:
    def get_security_requirements(self):
        return {
            'authentication': {
                'login_requests_per_second': 50_000,
                'token_validations_per_second': 500_000,
                'password_reset_requests_daily': 100_000,
                'mfa_verifications_daily': 1_000_000
            },
            'authorization': {
                'permission_checks_per_second': 1_000_000,
                'role_based_access_policies': 10_000,
                'resource_level_permissions': 100_000_000
            }
        }

2. Security Monitoring Scale

Threat Detection:

class ThreatDetection:
    def get_monitoring_scale(self):
        return {
            'log_ingestion': '1TB/hour',   # Security logs
            'events_processed': '10M/minute',
            'threat_alerts_daily': 10_000,
            'false_positive_rate': '< 5%',
            'detection_latency': '< 30 seconds'
        }

Cost Optimization Constraints

1. Infrastructure Costs

Cost Targets:

class CostConstraints:
    def get_cost_targets(self):
        return {
            'infrastructure_cost_per_merchant': '$50/month',
            'cdn_cost_percentage': '15%',      # Of total infra cost
            'database_cost_percentage': '30%',
            'compute_cost_percentage': '40%',
            'storage_cost_percentage': '15%'
        }

2. Efficiency Metrics

Resource Utilization:

class EfficiencyTargets:
    def get_utilization_targets(self):
        return {
            'cpu_utilization': '70%',          # Target average
            'memory_utilization': '80%',
            'storage_utilization': '85%',
            'network_utilization': '60%',
            'cache_hit_ratio': '95%'
        }

Monitoring and Observability Scale

1. Metrics Collection

Telemetry Scale:

class ObservabilityScale:
    def get_telemetry_requirements(self):
        return {
            'metrics_ingestion': '100M data points/minute',
            'log_ingestion': '10TB/day',
            'trace_ingestion': '1B spans/day',
            'alert_evaluations': '1M/minute',
            'dashboard_queries': '100K/minute'
        }

These scale constraints define the massive requirements for a Shopify-level platform, emphasizing the need for careful architecture decisions around multi-tenancy, data partitioning, and global distribution.