Load Balancer - Security and Privacy
DDoS Protection
SYN Flood Protection
Attack: Attacker sends massive volume of TCP SYN packets without completing handshake
Impact: Exhausts connection table memory, prevents legitimate connections
Scale: Modern attacks reach 100M+ SYN packets/second
Defense layers:
1. SYN Cookies (Kernel-level):
- Don't allocate state until handshake completes
- Encode connection info in initial sequence number
- Verify on ACK receipt (stateless until connection established)
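- Capacity: No per-SYN state, so SYN volume alone cannot exhaust connection-table memory (CPU and bandwidth still bound throughput)
- Trade-off: The cookie has little room for TCP options, so window scaling and SACK are lost or degraded unless the implementation encodes them elsewhere (Linux uses the TCP timestamp field)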
2. SYN Proxy (LB-level):
- LB completes TCP handshake with client
- Only forwards to backend after valid ACK received
- Absorbs SYN flood without backend impact
- Memory: Small per-SYN state (64 bytes vs 512 bytes full connection)
3. Rate Limiting SYNs per source IP:
- Threshold: 100 SYNs/second per IP (configurable)
- Action: DROP excess SYNs, log source IP
- Whitelist: Known good IPs exempt from limit
4. XDP/eBPF SYN validation:
- Process SYN packets at NIC driver level
- Drop invalid SYNs before kernel processing
- Capacity: 10M+ packets/sec per core
- Used by: Cloudflare, Facebook (Katran)
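The SYN cookie idea in layer 1 can be sketched as follows. This is a minimal illustration, not the kernel's actual algorithm: the bit layout (8-bit coarse time counter plus 24-bit truncated HMAC), the `SECRET` value, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac

SECRET = b"example-secret"  # hypothetical; a real server would rotate this


def make_cookie(conn, client_isn, counter):
    """Build a 32-bit initial sequence number for the SYN-ACK.

    conn is (src_ip, src_port, dst_ip, dst_port); counter is a coarse clock
    (for example, minutes since boot). Layout assumed by this sketch:
    top 8 bits = counter mod 256, low 24 bits = truncated HMAC.
    """
    msg = f"{conn}|{client_isn}|{counter}".encode()
    mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return ((counter & 0xFF) << 24) | int.from_bytes(mac[:3], "big")


def check_cookie(cookie, conn, client_isn, now_counter, max_age=1):
    """Validate the cookie echoed in the final ACK without any stored state."""
    for age in range(max_age + 1):
        candidate = now_counter - age
        if candidate >= 0 and make_cookie(conn, client_isn, candidate) == cookie:
            return True
    return False


conn = ("198.51.100.7", 40000, "203.0.113.10", 443)
cookie = make_cookie(conn, client_isn=12345, counter=1000)
# The client's ACK acknowledges cookie + 1, so the server recovers it as ack - 1.
print(check_cookie(cookie, conn, 12345, now_counter=1001))
```

Because the server recomputes the cookie from the packet itself, nothing is allocated until `check_cookie` succeeds, which is the "stateless until connection established" property described above.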
Configuration example:
syn_flood_protection:
  enabled: true
  syn_cookies: true
  syn_proxy: true
  max_syn_backlog: 65536
  syn_rate_limit_per_ip: 100
  syn_rate_limit_global: 1000000
  xdp_validation: true
Amplification Attack Protection
Attack: Attacker spoofs victim's IP, sends requests to amplifiers (DNS, NTP, memcached)
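Impact: Victim receives amplified response traffic (amplification factors of 100x or more are possible, depending on the protocol)
Scale: A 3.47 Tbps attack was reported by Microsoft in January 2022; larger attacks have been reported since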
Defense at LB layer:
1. Ingress filtering (BCP38):
- Verify that the source IP is routable and plausible for the interface it arrived on
- Drop packets arriving from the internet whose source IP belongs to your own address space (these are spoofed)
- Implement uRPF (Unicast Reverse Path Forwarding)
2. Protocol-specific rate limiting:
- DNS response rate limiting: 1000 responses/sec per destination
- NTP monlist blocking: Drop NTP mode 7 packets
- Memcached: Block UDP port 11211 from internet
3. Traffic scrubbing:
- Divert suspicious traffic through scrubbing center
- Analyze traffic patterns, drop attack traffic
- Forward clean traffic to LB
- Providers: Cloudflare, AWS Shield, Akamai Prolexic
4. Anycast absorption:
- Distribute attack across multiple PoPs
- Each PoP absorbs fraction of attack
- 100 PoPs × 10 Gbps each = 1 Tbps absorption capacity
5. Blackhole routing (last resort):
- Advertise /32 route to null for attacked IP
- Drops ALL traffic (attack + legitimate)
- Used when attack threatens network infrastructure
Slowloris and Slow HTTP Attacks
Attack: Open many connections, send data very slowly to exhaust connection slots
Impact: Ties up all available connections, denies service to legitimate users
Variants: Slow headers, slow POST body, slow read
Defense mechanisms:
1. Connection timeouts:
- Header timeout: 10 seconds (must receive complete headers)
- Body timeout: 30 seconds (must receive complete body)
- Idle timeout: 60 seconds (no data = close connection)
- Minimum data rate: 100 bytes/second (below = close)
2. Connection limits per IP:
- Max concurrent connections per IP: 100
- Max new connections per IP per second: 20
- Exempt: Known good IPs, internal services
3. Request size limits:
- Max header size: 8KB
- Max request line: 4KB
- Max number of headers: 100
- Max body size: 10MB (configurable per route)
4. Behavioral detection:
- Track connection duration vs data transferred
- Flag connections with abnormally low throughput
- Score-based system: multiple slow indicators = block
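The timeout and behavioral checks above can be combined into a small scoring function. This is a sketch under stated assumptions: the `ConnStats` fields, the 5-second grace period before the data-rate check applies, and the "two or more indicators means block" threshold are hypothetical choices; the timeout values are the ones listed above.

```python
from dataclasses import dataclass


@dataclass
class ConnStats:
    age_seconds: float       # time since the connection was accepted
    body_seconds: float      # time since headers completed (0 if not yet)
    idle_seconds: float      # time since the last byte arrived
    bytes_received: int
    headers_complete: bool
    body_complete: bool


def slow_indicators(c, header_timeout=10, body_timeout=30,
                    idle_timeout=60, min_rate=100, grace_seconds=5):
    """Return the list of slow-attack indicators this connection trips."""
    reasons = []
    if not c.headers_complete and c.age_seconds > header_timeout:
        reasons.append("header_timeout")
    if c.headers_complete and not c.body_complete and c.body_seconds > body_timeout:
        reasons.append("body_timeout")
    if c.idle_seconds > idle_timeout:
        reasons.append("idle_timeout")
    # Skip the rate check for very young connections to avoid false positives.
    if c.age_seconds > grace_seconds and c.bytes_received / c.age_seconds < min_rate:
        reasons.append("low_data_rate")
    return reasons


def should_block(c, min_indicators=2):
    """Score-based decision: block only when several indicators agree."""
    return len(slow_indicators(c)) >= min_indicators
```

A connection that has sent 200 bytes in 40 seconds without finishing its headers trips both `header_timeout` and `low_data_rate`, so `should_block` returns True for it, while a completed request that arrived quickly trips none.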
Configuration:
slowloris_protection:
  header_timeout_seconds: 10
  body_timeout_seconds: 30
  idle_timeout_seconds: 60
  min_data_rate_bytes_per_second: 100
  max_connections_per_ip: 100
  max_new_connections_per_ip_per_second: 20
HTTP Flood (Layer 7 DDoS)
Attack: Legitimate-looking HTTP requests at massive scale
Impact: Overwhelms application layer (harder to distinguish from real traffic)
Scale: 10M+ requests/second from botnets
Defense mechanisms:
1. Rate limiting (progressive):
- Tier 1: 100 req/sec per IP (soft limit, add CAPTCHA)
- Tier 2: 500 req/sec per IP (hard limit, block)
- Tier 3: 10,000 req/sec global per URL (protect specific endpoints)
2. Challenge-response:
- JavaScript challenge (blocks simple bots)
- CAPTCHA for suspicious traffic
- Proof-of-work challenge (computational cost for client)
3. Behavioral analysis:
- Request pattern analysis (uniform timing = bot)
- Browser fingerprinting (headless browsers detected)
- TLS fingerprinting (JA3/JA4 hash identifies bot libraries)
- Mouse/keyboard interaction tracking (for web applications)
4. Reputation-based blocking:
- IP reputation databases (Spamhaus, AbuseIPDB)
- ASN reputation (hosting providers used by botnets)
- Geographic anomaly detection (sudden traffic from unusual regions)
- Device fingerprint reputation
5. Adaptive rate limiting:
- Normal: 100 req/sec per IP
- Under attack: 10 req/sec per IP (tighten automatically)
- Detection: Global request rate exceeds 3x normal for 60 seconds
- Recovery: Gradually relax limits after attack subsides
SSL/TLS Termination and Re-encryption
TLS Termination at Load Balancer
Architecture:
Client <--[TLS 1.3]--> LB <--[plaintext or TLS]--> Backend
Benefits:
- Centralized certificate management (one place to update certs)
- Offload CPU-intensive crypto from backends
- Enable L7 inspection (routing, WAF, logging)
- Better TLS configuration (enforce modern ciphers centrally)
- Session resumption across backends (shared session cache)
TLS Configuration (production-grade):
tls:
  min_version: TLSv1.2
  max_version: TLSv1.3
  cipher_suites:
    # TLS 1.3 (always preferred)
    - TLS_AES_256_GCM_SHA384
    - TLS_CHACHA20_POLY1305_SHA256
    - TLS_AES_128_GCM_SHA256
    # TLS 1.2 (for compatibility)
    - ECDHE-ECDSA-AES256-GCM-SHA384
    - ECDHE-RSA-AES256-GCM-SHA384
    - ECDHE-ECDSA-CHACHA20-POLY1305
    - ECDHE-RSA-CHACHA20-POLY1305
  ecdh_curves:
    - X25519
    - P-256
    - P-384
  session_tickets: true
  session_timeout: 3600
  ocsp_stapling: true
  hsts:
    enabled: true
    max_age: 31536000
    include_subdomains: true
    preload: true
Performance impact:
- RSA-2048 handshake: ~1ms CPU time
- ECDSA P-256 handshake: ~0.2ms CPU time
- Session resumption: ~0.05ms (skip key exchange)
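- TLS 1.3 0-RTT: No handshake round trip for resumed connections, but early data is replayable, so allow it only for idempotent requests
- Bulk encryption (AES-GCM): ~1 Gbps per core or more with AES-NI (hardware-dependent)
Backend Re-encryption (TLS to Backend)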
Architecture:
Client <--[TLS 1.3]--> LB <--[TLS 1.2/1.3]--> Backend
When to use:
- Compliance requirements (data encrypted in transit everywhere)
- Zero-trust network model
- Backend in different security zone
- Regulatory requirements (PCI-DSS, HIPAA)
Configuration:
backend_tls:
  enabled: true
  verify_certificate: true
  ca_certificate: "/certs/internal-ca.pem"
  client_certificate: "/certs/lb-client.pem" # mTLS
  client_key: "/certs/lb-client-key.pem"
  sni_hostname: "backend.internal.example.com"
  min_version: TLSv1.2
  cipher_suites:
    - ECDHE-ECDSA-AES256-GCM-SHA384
Performance impact:
- Additional ~0.5ms latency per new backend connection (backend TLS handshake)
- Mitigated with connection pooling (reuse TLS connections, so most requests skip the handshake)
- ~10% CPU overhead on LB for re-encryption
SNI-Based Routing
How: Use TLS Server Name Indication to route before decryption
Use case: Multiple domains on same IP, route to different backends
Process:
1. Client sends ClientHello with SNI = "api.example.com"
2. LB reads SNI (sent in cleartext in both TLS 1.2 and TLS 1.3; hidden only when Encrypted Client Hello (ECH) is in use)
3. LB selects certificate and backend pool based on SNI
4. TLS handshake completes with correct certificate
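The selection in step 3 can be sketched as a first-match lookup. This is a minimal illustration that reuses the example hostnames, certificate names, and pool names from this section's configuration; the `ROUTES` table shape and the rule that a wildcard covers exactly one left-most label are assumptions of the sketch.

```python
# (pattern, certificate, pool), evaluated in order; first match wins.
ROUTES = [
    ("api.example.com", "cert-api", "pool-api-production"),
    ("web.example.com", "cert-web", "pool-web-production"),
    ("*.example.com", "cert-wildcard", "pool-default"),
]


def matches(pattern, name):
    """Case-insensitive match; '*.' covers exactly one left-most label."""
    pattern = pattern.lower()
    name = name.lower().rstrip(".")
    if pattern.startswith("*."):
        head, _, rest = name.partition(".")
        return bool(head) and rest == pattern[2:]
    return name == pattern


def route(sni):
    """Return (certificate, pool), or None to reject the connection."""
    if sni:
        for pattern, cert, pool in ROUTES:
            if matches(pattern, sni):
                return cert, pool
    return None  # default action: reject


print(route("api.example.com"))
print(route("admin.example.com"))
print(route("unknown.test"))
```

Listing exact names before the wildcard matters here: with first-match semantics, placing `*.example.com` first would shadow the more specific entries.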
Configuration:
sni_routing:
  - sni: "api.example.com"
    certificate: "cert-api"
    pool: "pool-api-production"
  - sni: "web.example.com"
    certificate: "cert-web"
    pool: "pool-web-production"
  - sni: "*.example.com"
    certificate: "cert-wildcard"
    pool: "pool-default"
  - default:
      action: "reject" # No matching SNI = connection refused
WAF (Web Application Firewall) Integration
WAF at Load Balancer Layer
Architecture:
Client -> LB (TLS termination) -> WAF Engine -> Routing -> Backend
Inspection points:
- Request headers (Host, User-Agent, Cookie, Authorization)
- Request URL and query parameters
- Request body (POST data, JSON, XML, multipart)
- Response headers and body (optional, for data leak prevention)
Rule categories:
1. OWASP Core Rule Set (CRS):
- SQL Injection detection (pattern matching + libinjection)
- Cross-Site Scripting (XSS) detection
- Remote Code Execution patterns
- Local/Remote File Inclusion
- Command Injection
2. Protocol enforcement:
- Valid HTTP method (block TRACE, CONNECT for web apps)
- Content-Type validation
- Request size limits
- Character encoding validation
- Multipart form validation
3. Bot detection:
- Known bad User-Agents
- Missing expected headers (Accept, Accept-Language)
- TLS fingerprint analysis (JA3)
- Request timing analysis
4. Custom rules:
- Business logic protection (rate limit login attempts)
- API schema validation (reject malformed JSON)
- Geographic restrictions
- Time-based access control
Performance impact:
- Simple pattern matching: <0.1ms per request
- Full CRS evaluation: 1-5ms per request
- Body inspection (large payloads): 5-20ms per request
- Recommendation: Inspect headers always, body selectively
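The rule categories above feed a cumulative anomaly score, which the configuration in this section blocks at a threshold of 5. The sketch below illustrates that scoring model only: the four rules, their patterns, and their scores are toy stand-ins invented for the example, not OWASP CRS rules.

```python
import re

# Toy patterns (hypothetical); real rule sets are far more thorough.
SQLI = re.compile(r"\bunion\b.+\bselect\b", re.IGNORECASE)
XSS = re.compile(r"<script\b", re.IGNORECASE)

# (rule id, severity score, predicate over the request dict)
RULES = [
    ("sqli_union_select", 5, lambda r: bool(SQLI.search(r.get("query", "")))),
    ("xss_script_tag", 4, lambda r: bool(XSS.search(r.get("query", "")))),
    ("bad_user_agent", 3,
     lambda r: "sqlmap" in r.get("headers", {}).get("User-Agent", "").lower()),
    ("missing_accept", 2, lambda r: "Accept" not in r.get("headers", {})),
]


def evaluate(request, threshold=5, mode="BLOCK"):
    """Sum the scores of all matching rules and compare against the threshold."""
    matched = [(rule_id, score) for rule_id, score, pred in RULES if pred(request)]
    total = sum(score for _, score in matched)
    if total < threshold:
        action = "PASS"
    else:
        # In DETECT mode the request is logged but still forwarded.
        action = "BLOCK" if mode == "BLOCK" else "DETECT"
    return action, total, [rule_id for rule_id, _ in matched]


request = {"query": "id=1 UNION SELECT password FROM users",
           "headers": {"Accept": "*/*"}}
print(evaluate(request))
```

Scoring lets several weak signals (a missing header plus a suspicious User-Agent) add up to a block, while a single low-severity match passes.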
Configuration:
waf:
  enabled: true
  mode: "BLOCK" # DETECT (log only) or BLOCK (reject)
  rule_sets:
    - owasp_crs_v4
    - custom_api_rules
  exclusions:
    - path: "/api/upload"
      rules: ["body_size_limit"]
    - path: "/webhooks/*"
      rules: ["sql_injection"] # Webhook payloads trigger false positives
  anomaly_scoring:
    threshold: 5 # Block if cumulative score >= 5
    per_rule_score: 1-5 # Based on severity
IP Allowlisting and Blocklisting
Implementation Architecture
Data structures for fast IP lookup:
1. Exact IP match: Hash set
- O(1) lookup
- Memory: 10M IPs × 4 bytes = 40 MB of raw IPv4 keys (hash-set overhead adds to this)
2. CIDR range match: Radix tree (Patricia trie)
- O(32) lookup for IPv4, O(128) for IPv6
- Memory: 100K ranges × 64 bytes = 6.4 MB
3. Country/ASN match: Pre-computed GeoIP database
- O(log n) lookup (binary search on sorted ranges)
- Memory: ~50 MB (MaxMind database)
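The CIDR lookup in item 2 can be illustrated with an uncompressed binary trie, a simpler relative of the Patricia trie named above (it walks one bit per level instead of compressing single-child paths). The class and method names are assumptions of this sketch, and it expects one trie per address family.

```python
import ipaddress


class CidrTrie:
    """Binary trie keyed on address bits; lookup returns the longest match."""

    def __init__(self):
        self.root = {}

    def insert(self, cidr, value=True):
        net = ipaddress.ip_network(cidr)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (net.max_prefixlen - 1 - i)) & 1
            node = node.setdefault(bit, {})
        node["value"] = value

    def lookup(self, ip):
        addr = ipaddress.ip_address(ip)
        bits, width = int(addr), addr.max_prefixlen
        node = self.root
        best = node.get("value")
        for i in range(width):  # at most 32 steps for IPv4
            node = node.get((bits >> (width - 1 - i)) & 1)
            if node is None:
                break
            if "value" in node:
                best = node["value"]  # remember the most specific match so far
        return best


blocklist = CidrTrie()
blocklist.insert("203.0.113.0/24", "blocked-range")
blocklist.insert("10.0.0.0/8", "a")
blocklist.insert("10.1.0.0/16", "b")
print(blocklist.lookup("203.0.113.50"))
print(blocklist.lookup("10.1.2.3"))
```

The loop bound is what the O(32) figure refers to: the cost depends on address width, not on how many ranges are stored.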
Processing order (short-circuit evaluation):
1. Check allowlist (if match -> ALLOW, skip remaining checks)
2. Check blocklist (if match -> DENY)
3. Check rate limits
4. Check WAF rules
5. Default: ALLOW
Dynamic Blocklist Management
Sources of blocklist entries:
1. Automated detection:
- Rate limit violations (auto-block after 3 violations in 1 hour)
- WAF rule triggers (auto-block after 10 attacks in 5 minutes)
- Failed authentication attempts (auto-block after 50 failures)
- Port scanning detection
2. Threat intelligence feeds:
- Spamhaus DROP/EDROP lists (updated hourly)
- AbuseIPDB (community-reported IPs)
- Internal threat intelligence
- Tor exit nodes (optional, context-dependent)
3. Manual entries:
- Operator-added blocks (with expiry)
- Incident response blocks (immediate, reviewed within 24h)
Auto-expiry:
- Rate limit blocks: 1 hour (escalating: 1h, 4h, 24h, 7d)
- WAF blocks: 24 hours
- Threat intel: Until removed from feed
- Manual blocks: Configurable (default 30 days)
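The escalating expiry for rate limit blocks (1h, 4h, 24h, 7d) can be computed from the number of earlier blocks for the same source. This is a sketch: the `block_expiry` name is hypothetical, and capping repeat offenders at 7 days is an assumption, since the text does not say what happens after the fourth block.

```python
from datetime import datetime, timedelta, timezone

ESCALATION = [
    timedelta(hours=1),
    timedelta(hours=4),
    timedelta(hours=24),
    timedelta(days=7),
]


def block_expiry(prior_blocks, now):
    """Return when a new rate-limit block should expire.

    prior_blocks is how many times this source was blocked before;
    anything beyond the last step stays at the longest duration (assumption).
    """
    step = min(prior_blocks, len(ESCALATION) - 1)
    return now + ESCALATION[step]


now = datetime(2024, 1, 20, 15, 0, tzinfo=timezone.utc)
for n in range(5):
    print(n, block_expiry(n, now).isoformat())
```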
API for blocklist management (send either "ip" for a single address or "cidr" for a range; both are shown here for illustration):
POST /api/v1/acls/blocklist
{
  "ip": "203.0.113.50",
  "cidr": "203.0.113.0/24",
  "reason": "Automated: rate_limit_violation",
  "source": "auto_detection",
  "expires_at": "2024-01-21T15:00:00Z",
  "severity": "medium"
}
Rate Limiting at LB Layer
Multi-Tier Rate Limiting
Tier 1: Global rate limit (protect infrastructure)
- 5M requests/second total capacity
- Action: Return 503 when exceeded
- Purpose: Prevent total system overload
Tier 2: Per-IP rate limit (prevent abuse)
- 100 requests/second per source IP
- Burst: 200 requests (token bucket)
- Action: Return 429 with Retry-After header
- Purpose: Prevent single-source abuse
Tier 3: Per-endpoint rate limit (protect specific APIs)
- /api/login: 5 requests/minute per IP
- /api/search: 30 requests/minute per IP
- /api/upload: 10 requests/minute per user
- Action: Return 429 with specific error message
Tier 4: Per-user rate limit (authenticated)
- Free tier: 1000 requests/hour
- Pro tier: 10000 requests/hour
- Enterprise: Custom limits
- Key: API key or JWT subject claim
Implementation:
Algorithm: Token bucket (per-IP) + sliding window (per-endpoint)
Storage: In-memory hash map with LRU eviction
Distributed: Approximate local counting + periodic Redis sync
Accuracy: ±10% (acceptable for rate limiting)
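A token bucket with the Tier 2 numbers (100 requests/second, burst of 200) can be sketched as below. The class is a minimal single-key illustration; the explicit `now` argument is an assumption made so the behavior is deterministic, where a real limiter would read a monotonic clock and keep one bucket per source IP.

```python
class TokenBucket:
    def __init__(self, rate, burst, now=0.0):
        self.rate = rate            # tokens added per second
        self.capacity = burst       # maximum tokens the bucket can hold
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.updated = now

    def allow(self, now, cost=1.0):
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def retry_after(self):
        """Seconds until one token is available (for the Retry-After header)."""
        return max(0.0, (1.0 - self.tokens) / self.rate)


bucket = TokenBucket(rate=100, burst=200)
burst_allowed = sum(bucket.allow(0.0) for _ in range(250))
later_allowed = sum(bucket.allow(0.5) for _ in range(60))
print(burst_allowed, later_allowed, bucket.retry_after())
```

The first 200 of 250 simultaneous requests pass, and half a second later only the 50 refilled tokens are available, which is how the burst and sustained-rate parameters interact.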
Response headers (429 and Retry-After follow RFC 6585; the X-RateLimit-* names are a de facto convention, not part of the RFC):
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1705766400
Retry-After: 30
Distributed Rate Limiting
Challenge: 200 LB instances must enforce global rate limits
Approach 1: Local approximation
- Each instance enforces limit/N (where N = instance count)
- Simple but inaccurate (±50% if traffic unevenly distributed)
- No coordination overhead
Approach 2: Periodic sync via Redis
- Each instance maintains local counter
- Every 1 second, sync to Redis (INCRBY)
- Read global count from Redis
- Accuracy: ±10% (1-second window of drift)
- Latency: No impact on request path (async sync)
Approach 3: Token bucket in Redis (exact)
- Every request checks Redis (Lua script for atomicity)
- Exact enforcement across all instances
- Latency: +0.5ms per request (Redis round-trip)
- Use only for critical limits (login, payment)
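Approach 2 can be sketched with an in-memory stand-in for Redis. Everything here is illustrative: `FakeCounterStore` only mimics the add-and-return behavior of INCRBY, `SyncedLimiter` is a hypothetical name, and window resets (for example a per-second key with a TTL) are left out to keep the drift behavior visible.

```python
class FakeCounterStore:
    """In-memory stand-in for Redis; incrby adds and returns the new total."""

    def __init__(self):
        self.counts = {}

    def incrby(self, key, amount):
        self.counts[key] = self.counts.get(key, 0) + amount
        return self.counts[key]


class SyncedLimiter:
    def __init__(self, store, key, global_limit):
        self.store = store
        self.key = key
        self.limit = global_limit
        self.pending = 0      # local hits not yet pushed to the store
        self.global_seen = 0  # global total as of the last sync

    def allow(self):
        """Request-path check: uses only local state, no network call."""
        if self.global_seen + self.pending >= self.limit:
            return False
        self.pending += 1
        return True

    def sync(self):
        """Background step (e.g. every 1 second): push local hits, pull total."""
        self.global_seen = self.store.incrby(self.key, self.pending)
        self.pending = 0


store = FakeCounterStore()
a = SyncedLimiter(store, "limit:/api/search", global_limit=10)
b = SyncedLimiter(store, "limit:/api/search", global_limit=10)
admitted = sum(a.allow() for _ in range(6)) + sum(b.allow() for _ in range(6))
a.sync()
b.sync()
print(admitted, store.counts["limit:/api/search"], b.allow())
```

Between syncs the two instances together admit 12 requests against a limit of 10, which is the drift the accuracy figure above describes; after syncing, further requests are refused.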
Recommended: Approach 2 for most limits, Approach 3 for security-critical endpoints
mTLS for Backend Communication
Mutual TLS Architecture
Purpose: Verify both LB and backend identity (zero-trust networking)
Certificate hierarchy:
Root CA (offline, HSM-protected)
└── Intermediate CA (online, short-lived)
├── LB Client Certificate (identifies LB to backends)
├── Backend Server Certificate (identifies backend to LB)
└── Service Certificates (for service-to-service)
Handshake flow:
1. LB connects to backend and sends ClientHello
2. Backend presents server certificate and requests a client certificate
3. LB verifies backend certificate against trusted CA
4. LB presents client certificate and proves possession of its private key
5. Backend verifies LB certificate against trusted CA
6. Both parties authenticated, encrypted channel established
Configuration (LB side):
backend_mtls:
  client_certificate: "/certs/lb-client.pem"
  client_key: "/certs/lb-client-key.pem"
  ca_certificate: "/certs/internal-ca-bundle.pem"
  verify_backend: true
  allowed_sans: # Only connect to backends with these SANs
    - "*.api.internal.example.com"
    - "*.worker.internal.example.com"
  crl_url: "http://pki.internal/crl.pem"
  ocsp_url: "http://pki.internal/ocsp"
Benefits:
- Prevents unauthorized backends from receiving traffic
- Prevents unauthorized LBs from connecting to backends
- Encrypted communication even on internal network
- Audit trail (certificate identity in logs)
- Compliance (PCI-DSS, SOC2, HIPAA)
Certificate Rotation for mTLS
Rotation strategy: Dual-certificate overlap period
Timeline:
Day 0: Generate new certificate (valid from Day 0)
Day 0: Deploy new cert alongside old cert (both valid)
Day 1-7: Gradually shift to new certificate
Day 7: Remove old certificate
Day 30: Old certificate expires (safety margin)
Automated rotation:
- Certificate lifetime: 90 days
- Rotation trigger: 30 days before expiry
- Rotation method: Rolling deployment (no downtime)
- Monitoring: Alert if cert expires within 14 days
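The rotation trigger and expiry alert above reduce to a comparison against the certificate's expiry time. The sketch below assumes a hypothetical `rotation_status` helper and status names; the 90-day lifetime, 30-day trigger, and 14-day alert are the values listed above.

```python
from datetime import datetime, timedelta, timezone


def rotation_status(not_after, now, renew_before_days=30, alert_before_days=14):
    """Classify a certificate by how much validity it has left."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= timedelta(days=alert_before_days):
        return "alert"   # rotation should already have happened
    if remaining <= timedelta(days=renew_before_days):
        return "renew"   # inside the normal rotation window
    return "ok"


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = issued + timedelta(days=90)  # 90-day certificate lifetime
for day in (45, 64, 79, 91):
    print(day, rotation_status(not_after, issued + timedelta(days=day)))
```

Reaching the `alert` state means automated rotation failed somewhere in the preceding 16 days, which is why it is treated as a monitoring signal and not as the rotation trigger itself.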
Tools:
- cert-manager (Kubernetes): Automatic rotation via ACME (e.g., Let's Encrypt for public names) or an internal CA/Vault issuer for mTLS certificates
- Vault PKI: Internal CA with short-lived certificates (24h)
- SPIFFE/SPIRE: Identity framework with automatic cert rotation
Certificate Management and Rotation
Certificate Lifecycle Management
Stages:
1. Generation/Procurement
- ACME (Let's Encrypt): Automated, free, 90-day validity
- Commercial CA: Manual, paid, 1-year validity
- Internal CA: Automated, internal only, configurable validity
2. Deployment
- Push to all LB instances (configuration management)
- Verify deployment (check all instances serving new cert)
- Rollback plan (keep old cert available for 24h)
3. Monitoring
- Expiry tracking (alert at 30, 14, 7, 1 days)
- Certificate transparency log monitoring
- OCSP response monitoring
- Certificate revocation monitoring
4. Renewal
- Automated renewal 30 days before expiry
- Validation: HTTP-01, DNS-01, or TLS-ALPN-01 challenge
- Deployment: Rolling update across LB fleet
- Verification: Confirm new cert served on all instances
5. Revocation (emergency)
- Publish to CRL (Certificate Revocation List)
- Update OCSP responder
- Deploy replacement certificate immediately
- Notify affected parties
Automation:
certificate_management:
  provider: "letsencrypt"
  auto_renew: true
  renewal_days_before_expiry: 30
  challenge_type: "dns-01"
  dns_provider: "route53"
  deployment_strategy: "rolling"
  rollback_on_failure: true
  monitoring:
    expiry_warning_days: [30, 14, 7, 1]
    alert_channel: "pagerduty"
Multi-Domain and Wildcard Certificates
Strategy for large-scale deployments:
Option 1: Wildcard certificate
- *.example.com covers all subdomains
- Single cert for api.example.com, web.example.com, etc.
- Simpler management (one cert to rotate)
- Risk: Compromise affects all subdomains
Option 2: Per-service certificates
- api.example.com has its own certificate
- web.example.com has its own certificate
- Better isolation (compromise limited to one service)
- More complex management (many certs to track)
Option 3: SAN (Subject Alternative Name) certificates
- Single cert with multiple domains listed
- api.example.com, web.example.com, admin.example.com
- Compromise: Between wildcard and per-service
- Limitation: Must reissue to add/remove domains
Recommendation for production:
- Wildcard for internal services (*.internal.example.com)
- Per-service certs for external-facing services
- Separate certs for different security zones
- Short-lived certs (90 days) to limit exposure window
Access Logging and Audit Trails
Access Log Format
Log entry per request (structured JSON):
{
  "timestamp": "2024-01-20T15:00:00.123Z",
  "request_id": "req-abc123-def456",
  "client_ip": "203.0.113.50",
  "client_port": 54321,
  "server_ip": "10.0.42.100",
  "server_port": 8080,
  "method": "POST",
  "host": "api.example.com",
  "path": "/v1/users",
  "query": "page=1",
  "protocol": "HTTP/2",
  "status_code": 201,
  "request_size_bytes": 1024,
  "response_size_bytes": 256,
  "request_time_ms": 45.2,
  "upstream_time_ms": 42.1,
  "ssl_protocol": "TLSv1.3",
  "ssl_cipher": "TLS_AES_256_GCM_SHA384",
  "user_agent": "Mozilla/5.0...",
  "referer": "https://web.example.com/dashboard",
  "x_forwarded_for": "203.0.113.50",
  "lb_instance": "lb-us-east-1a-042",
  "pool_id": "pool-api-production",
  "backend_server": "backend-us-east-1a-api-0042",
  "rate_limited": false,
  "waf_action": "PASS",
  "waf_rules_matched": [],
  "geo_country": "US",
  "geo_city": "New York",
  "asn": 15169,
  "connection_reused": true,
  "compression": "gzip"
}
Volume: 1.16M entries/second × 1KB = 1.16 GB/s of logs
Storage: 100 TB/day (uncompressed), 10 TB/day (compressed)
Retention: 30 days hot (Elasticsearch), 1 year cold (S3)
Security Audit Trail
Audit events (control plane actions):
{
  "event_id": "audit-abc123",
  "timestamp": "2024-01-20T15:00:00Z",
  "actor": {
    "type": "user",
    "id": "admin@example.com",
    "ip": "10.0.1.50",
    "auth_method": "mTLS"
  },
  "action": "backend.remove",
  "resource": {
    "type": "backend_server",
    "id": "backend-us-east-1a-api-0042",
    "pool": "pool-api-production"
  },
  "details": {
    "drain_timeout": 300,
    "active_connections_at_removal": 0,
    "reason": "Scheduled decommission"
  },
  "result": "SUCCESS",
  "changes": {
    "before": {"status": "DRAINING", "connections": 0},
    "after": {"status": "REMOVED"}
  }
}
Audit events to capture:
- Configuration changes (routing rules, ACLs, rate limits)
- Backend additions/removals
- Certificate uploads/rotations
- Health check overrides
- Rate limit exemptions
- Emergency blocks/unblocks
- Scaling events
- Failover events
Storage:
- Immutable append-only log
- Cryptographically signed entries
- Retention: 7 years (compliance)
- Access: Read-only for most users, append-only for system
- Tamper detection: Hash chain (each entry includes hash of previous)
Privacy Considerations
Data minimization:
- Log client IPs (required for security)
- Do NOT log request bodies (may contain PII)
- Do NOT log Authorization header values
- Do NOT log cookie values (session tokens)
- Truncate User-Agent to 256 characters
- Hash or mask sensitive query parameters
GDPR compliance:
- Right to erasure: Ability to purge logs for specific IP/user
- Data retention: Automatic deletion after retention period
- Data access: Provide logs related to specific user on request
- Data minimization: Only log what's necessary for security/operations
PCI-DSS compliance:
- Never log full credit card numbers
- Mask PAN in any logged data
- Encrypt logs at rest
- Restrict log access to authorized personnel
- Retain logs for minimum 1 year
Log sanitization pipeline:
1. Raw log generated (full data)
2. Sanitization filter removes/masks sensitive fields
3. Sanitized log written to storage
4. Raw log discarded (never persisted)
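The pipeline above can be sketched as a single function that takes the raw in-memory record and returns only what may be written, applying the redaction rules this section lists. The record shape, the `sanitize` name, and the `MASKED` placeholder for query values are assumptions of the sketch.

```python
import re
from urllib.parse import parse_qsl, urlencode

SENSITIVE_PARAM = re.compile(r"password|token|secret|key", re.IGNORECASE)
REDACTED_HEADERS = {"authorization", "cookie"}


def sanitize(raw):
    """Return a log-safe copy of a raw request record."""
    headers = {
        name: "[REDACTED]" if name.lower() in REDACTED_HEADERS else value
        for name, value in raw.get("headers", {}).items()
    }
    if "User-Agent" in headers:
        headers["User-Agent"] = headers["User-Agent"][:256]
    params = [
        (k, "MASKED" if SENSITIVE_PARAM.search(k) else v)
        for k, v in parse_qsl(raw.get("query", ""), keep_blank_values=True)
    ]
    return {
        "path": raw.get("path", ""),
        "query": urlencode(params),
        "headers": headers,
        # The body itself is never copied into the log entry, only its size.
        "request_size_bytes": len(raw.get("body", b"")),
    }


raw = {
    "path": "/v1/users",
    "query": "page=1&api_key=abc123",
    "headers": {"Authorization": "Bearer xyz", "Cookie": "sid=1",
                "User-Agent": "A" * 300},
    "body": b"card=4111",
}
clean = sanitize(raw)
print(clean["query"], clean["headers"]["Authorization"])
```

Building a new dictionary instead of mutating the raw record makes the "raw log never persisted" step easy to audit: only the return value of `sanitize` is ever handed to the log writer.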
Sanitization rules:
- Authorization header: Replace value with "[REDACTED]"
- Cookie header: Replace value with "[REDACTED]"
- Query params matching /password|token|secret|key/: Mask value
- Request body: Never logged (only size)
Security Hardening Checklist
Network Security
□ LB management interface on separate network (not internet-facing)
□ Backend servers not directly accessible from internet
□ Management API requires mTLS or VPN access
□ ICMP rate limited (prevent ping flood)
□ Unused ports closed (only 80, 443 exposed)
□ IPv6 security equivalent to IPv4
□ BGP session authentication (MD5 or TCP-AO)
□ ARP spoofing protection on LB network segment
Application Security
□ TLS 1.2+ only (TLS 1.0/1.1 disabled)
□ Strong cipher suites only (no RC4, DES, 3DES, NULL)
□ HSTS enabled with long max-age
□ X-Frame-Options: DENY
□ X-Content-Type-Options: nosniff
□ Content-Security-Policy headers added
□ Server header removed or generic
□ Error pages don't leak internal information
□ Request size limits enforced
□ Timeout values configured (prevent resource exhaustion)
Operational Security
□ Principle of least privilege for all access
□ Multi-factor authentication for management access
□ API keys rotated every 90 days
□ Audit logging enabled and monitored
□ Automated vulnerability scanning
□ Security patches applied within 24 hours (critical)
□ Incident response plan documented and tested
□ Regular penetration testing (quarterly)
□ Configuration drift detection
□ Secrets stored in vault (not in config files)
This security design ensures the load balancer serves as a robust security boundary, protecting backend services from external threats while maintaining high performance and operational visibility.