Trade-offs & Alternatives

📖 6 min read 📄 Part 7 of 10

Distributed Counter - Trade-offs and Alternatives

Accuracy vs Performance

Strong Consistency (100% Accurate)

Implementation:
- Synchronous updates across all replicas
- Distributed locks for atomic operations
- Quorum-based writes

Pros:
✓ 100% accurate
✓ No lost updates
✓ Predictable behavior

Cons:
✗ High latency (5-10ms)
✗ Lower throughput (100K ops/s)
✗ Reduced availability
✗ Complex implementation

Use Cases:
- Financial counters
- Voting systems
- Inventory counts
- Critical metrics

Cost: 5x infrastructure

Eventual Consistency (99%+ Accurate)

Implementation:
- Asynchronous replication
- Local updates, background sync
- CRDT for conflict resolution

Pros:
✓ Low latency (<1ms)
✓ High throughput (1M+ ops/s)
✓ High availability
✓ Simple implementation

Cons:
✗ Temporary inconsistency
✗ 1-5% error possible
✗ Convergence delay (1s)

Use Cases:
- Page views
- Analytics
- Social media metrics
- Non-critical counters

Cost: 1x infrastructure

Recommendation: Eventual consistency for most use cases

Approximate Counting (95%+ Accurate)

Implementation:
- Probabilistic data structures
- HyperLogLog, Count-Min Sketch
- Sampling techniques

Pros:
✓ Very low latency (<0.5ms)
✓ Very high throughput (10M+ ops/s)
✓ Constant memory usage
✓ Extremely scalable

Cons:
✗ 1-5% error rate
✗ Cannot get exact count
✗ Cannot decrement
✗ Limited operations

Use Cases:
- Unique visitors
- Real-time dashboards
- Approximate analytics
- High-cardinality metrics

Cost: 0.1x infrastructure

Recommendation: Use for unique counts and high-scale analytics

Storage Models

In-Memory Only (Redis)

Pros:
✓ Fastest (< 1ms latency)
✓ Simple implementation
✓ High throughput

Cons:
✗ Data loss on restart
✗ Limited by RAM
✗ Expensive at scale

Cost: $0.50/GB/month
Capacity: 100M counters = 10 GB = $5/month

Use When:
- Temporary counters
- Can rebuild from source
- Speed is critical

Hybrid (Redis + Database)

Pros:
✓ Fast reads (cache)
✓ Durable writes (database)
✓ Best of both worlds

Cons:
✗ More complex
✗ Cache invalidation
✗ Sync overhead

Cost: $0.50/GB (Redis) + $0.10/GB (DB)
Capacity: 10 GB Redis + 100 GB DB = $15/month

Use When:
- Need speed and durability
- Most production use cases
- Recommended approach

Database Only (PostgreSQL)

Pros:
✓ Durable
✓ ACID guarantees
✓ Complex queries

Cons:
✗ Slower (5-10ms latency)
✗ Lower throughput
✗ Scaling challenges

Cost: $0.10/GB/month
Capacity: 100 GB = $10/month

Use When:
- Durability critical
- Complex analytics needed
- Lower traffic

Counter Types

Simple Counter vs Sharded Counter

Simple Counter:
Structure: Single value
Throughput: 10K ops/s
Latency: <1ms
Complexity: Low
Use: Low-traffic counters

Sharded Counter:
Structure: Multiple shards
Throughput: 100K+ ops/s
Latency: <1ms (write), <5ms (read)
Complexity: Medium
Use: High-traffic counters

Trade-off:
- Simple: Easy but limited throughput
- Sharded: Complex but scalable

Recommendation: Start simple, shard when needed (>10K ops/s)

Exact Counter vs Approximate Counter

Exact Counter:
Accuracy: 100%
Memory: O(n) per counter
Operations: Increment, decrement, get
Use: Financial, voting

Approximate Counter (HyperLogLog):
Accuracy: 99%
Memory: O(1) - 12KB fixed
Operations: Add, count (no decrement)
Use: Unique counts, analytics

Trade-off:
- Exact: Accurate but expensive
- Approximate: Efficient but imprecise

Recommendation: Approximate for unique counts, exact for everything else

Synchronization Strategies

Synchronous Replication

Process:
1. Write to primary
2. Wait for replicas to acknowledge
3. Return success

Pros:
✓ Strong consistency
✓ No data loss
✓ Immediate visibility

Cons:
✗ High latency (+5ms)
✗ Lower availability
✗ Reduced throughput

Latency: 5-10ms
Throughput: 100K ops/s
Consistency: Strong

Asynchronous Replication

Process:
1. Write to primary
2. Return success immediately
3. Replicate in background

Pros:
✓ Low latency (<1ms)
✓ High availability
✓ High throughput

Cons:
✗ Eventual consistency
✗ Possible data loss
✗ Replication lag

Latency: <1ms
Throughput: 1M+ ops/s
Consistency: Eventual (1s)

Recommendation: Async for most use cases

Gossip Protocol

Process:
1. Each server updates locally
2. Periodically gossip with peers
3. Merge using CRDT rules

Pros:
✓ Decentralized
✓ Partition-tolerant
✓ Self-healing

Cons:
✗ Slower convergence
✗ More network traffic
✗ Complex debugging

Convergence: O(log N) rounds
Latency: <1ms local
Consistency: Eventual (1-5s)

Use When: Multi-region, high availability required

Time-Windowed Counters

Fixed Window vs Sliding Window

Fixed Window:
Window: 10:00:00 - 10:59:59
Pros: Simple, low memory
Cons: Boundary issues (burst at edges)

Example Issue:
10:59:59  1000 requests (allowed)
11:00:00  1000 requests (new window, allowed)
Total: 2000 requests in 1 second!

Sliding Window:
Window: Last 3600 seconds from now
Pros: Accurate, no boundary issues
Cons: More complex, higher memory

Recommendation: Sliding window for rate limiting, fixed for analytics

Bucket Granularity

1-second buckets:
- Memory: 3600 buckets × 24 bytes = 86 KB
- Accuracy: Very high
- Use: Rate limiting

1-minute buckets:
- Memory: 60 buckets × 24 bytes = 1.4 KB
- Accuracy: Good
- Use: Analytics

1-hour buckets:
- Memory: 24 buckets × 24 bytes = 576 bytes
- Accuracy: Moderate
- Use: Dashboards

Trade-off: Accuracy vs memory usage
Recommendation: 1-minute buckets for most use cases

Distributed Counter Alternatives

Redis INCR vs Sharded Counter

Redis INCR:
Throughput: 100K ops/s per key
Latency: <1ms
Complexity: Low
Bottleneck: Single key

Sharded Counter:
Throughput: 1M+ ops/s
Latency: <1ms (write), <5ms (read)
Complexity: Medium
Bottleneck: Aggregation

When to use:
- Redis INCR: <100K ops/s
- Sharded: >100K ops/s

CRDT vs Centralized Counter

CRDT (Conflict-free Replicated Data Type):
Pros:
 No coordination needed
 Partition-tolerant
 Eventually consistent
 Decentralized

Cons:
 Memory grows with servers
 Cannot decrement (G-Counter)
 Complex implementation

Centralized Counter:
Pros:
 Simple implementation
 Exact counts
 Full operations

Cons:
 Single point of failure
 Coordination overhead
 Lower availability

Recommendation: CRDT for multi-region, centralized for single region

Cost vs Performance

High Performance (Expensive)

Configuration:
- All in-memory (Redis)
- Synchronous replication
- Dedicated servers
- Premium network

Cost: $50K/month
Throughput: 10M ops/s
Latency: <1ms P99
Accuracy: 100%

Use When: Critical, high-traffic counters

Balanced (Recommended)

Configuration:
- Hybrid (Redis + Database)
- Asynchronous replication
- Shared servers
- Standard network

Cost: $10K/month
Throughput: 3M ops/s
Latency: <2ms P99
Accuracy: 99%+

Use When: Most production use cases

Cost-Optimized (Cheap)

Configuration:
- Database only
- No replication
- Spot instances
- Basic network

Cost: $2K/month
Throughput: 500K ops/s
Latency: <10ms P99
Accuracy: 100%

Use When: Low-traffic, non-critical counters

Technology Choices

Redis vs Memcached

Redis:
 Atomic operations (INCR, INCRBY)
 Persistence (RDB, AOF)
 Replication
 Lua scripting
 Single-threaded

Memcached:
 Multi-threaded
 Faster for simple ops
 No persistence
 No atomic operations
 Limited data structures

Recommendation: Redis for counters

PostgreSQL vs Cassandra

PostgreSQL:
✓ ACID transactions
✓ Complex queries
✓ Strong consistency
✗ Harder to scale horizontally

Cassandra:
✓ Linear scalability
✓ High availability
✓ Multi-region
✗ Eventual consistency
✗ Limited queries

Recommendation: PostgreSQL for exact counters, Cassandra for approximate

These trade-offs help make informed decisions based on specific requirements, constraints, and priorities for distributed counting systems.