Distributed Unique ID Generator - Interview Tips
Interview Approach
Problem Clarification (5 minutes)
Start by asking these key questions:
Scale Questions:
- "How many IDs per second do we need to generate?"
- "What's the expected peak load?"
- "How many datacenters/regions?"
- "What's the expected system lifetime?"
Requirements Questions:
- "Do IDs need to be sortable by time?"
- "Is approximate ordering acceptable?"
- "What's the acceptable latency?"
- "Do we need to support offline ID generation?"
Constraints Questions:
- "Can we use 64-bit integers or do we need 128-bit?"
- "Is clock synchronization available (NTP)?"
- "Can we assign unique worker IDs to nodes?"
- "What's the tolerance for clock drift?"
High-Level Design (10 minutes)
Start with the Basics
"I'll design a Snowflake-style ID generator with these components:
1. ID Format (64-bit):
- 41 bits: Timestamp (milliseconds)
- 5 bits: Datacenter ID (32 datacenters)
- 5 bits: Worker ID (32 workers per DC)
- 12 bits: Sequence (4,096 IDs/ms)
2. Key Properties:
- No coordination between nodes
- Time-ordered IDs
- High throughput (10K+ IDs/sec per node)
- Low latency (<1ms)
3. Architecture:
- Stateless ID generator nodes
- Load balancer for distribution
- Optional coordination service for worker IDs
- Monitoring and metrics"Draw the Architecture
┌─────────────┐
│ Clients │
└──────┬──────┘
│
┌──────┴──────┐
│Load Balancer│
└──────┬──────┘
│
┌───┴───┬───────┬───────┐
│ │ │ │
┌──┴──┐ ┌──┴──┐ ┌──┴──┐ ┌──┴──┐
│Node1│ │Node2│ │Node3│ │NodeN│
│WID:1│ │WID:2│ │WID:3│ │WID:N│
└─────┘ └─────┘ └─────┘ └─────┘Deep Dive (20-25 minutes)
ID Format Design
"Let me explain the 64-bit ID structure:
Bit Layout:
┌─────────────────────────────────────────────────────┐
│ 0 │ 41-bit Timestamp │ 5-bit DC │ 5-bit Worker │ 12-bit Seq │
└─────────────────────────────────────────────────────┘
Timestamp (41 bits):
- Milliseconds since custom epoch (2020-01-01)
- 2^41 ms = 69.7 years
- Provides time-ordering
- Allows efficient time-range queries
Datacenter ID (5 bits):
- 32 datacenters maximum
- Geographic distribution
- Regional isolation
- Disaster recovery
Worker ID (5 bits):
- 32 workers per datacenter
- 1,024 total workers globally
- No coordination between workers
- Unique per node
Sequence (12 bits):
- 4,096 IDs per millisecond per worker
- Resets each millisecond
- Handles burst traffic
- Overflow waits for next millisecond"Key Algorithms
"Here's the core generation algorithm:
def generate_id(self):
timestamp = current_time_ms()
# Handle clock regression
if timestamp < self.last_timestamp:
raise ClockRegressionError()
# Same millisecond - increment sequence
if timestamp == self.last_timestamp:
self.sequence = (self.sequence + 1) & 0xFFF
# Sequence overflow - wait
if self.sequence == 0:
timestamp = wait_next_ms()
else:
# New millisecond - reset sequence
self.sequence = 0
self.last_timestamp = timestamp
# Build ID: [timestamp][dc][worker][seq]
return ((timestamp - EPOCH) << 22) | \
(self.datacenter_id << 17) | \
(self.worker_id << 12) | \
self.sequence
Key points:
- Lock-free with atomic operations
- Handles clock regression
- Manages sequence overflow
- Sub-millisecond latency"Handling Tough Questions (10 minutes)
Q: "What if the clock goes backwards?"
"Great question! Clock regression is a critical issue.
Detection:
- Compare current time with last timestamp
- If current < last, clock moved backwards
Handling Options:
1. Refuse to Generate (Preferred):
- Throw error immediately
- Alert operations team
- Wait for clock to catch up
- Prevents duplicate IDs
2. Use Last Timestamp:
- Continue with last known time
- Increment sequence
- Risk of sequence overflow
- Temporary solution
3. Wait It Out:
- If regression < 5ms, wait
- Sleep until clock catches up
- Resume normal operation
- Acceptable for small regressions
Prevention:
- Multiple NTP servers
- Monitor clock drift
- Alert on drift > 100ms
- Refuse generation if drift > 1 second"Q: "How do you ensure uniqueness?"
"Uniqueness is guaranteed by the combination of:
1. Timestamp:
- Unique per millisecond
- Monotonically increasing
- 41 bits = 69 years
2. Datacenter + Worker ID:
- Unique per node
- 10 bits = 1,024 unique nodes
- No two nodes share same ID
3. Sequence:
- Unique within millisecond
- 12 bits = 4,096 per ms
- Resets each millisecond
Mathematical Proof:
- ID = f(timestamp, datacenter, worker, sequence)
- Each component unique in its dimension
- Combination guarantees global uniqueness
- Collision probability: 0 (with proper worker ID management)
Edge Cases:
- Clock regression: Detected and prevented
- Worker ID conflicts: Coordination service prevents
- Sequence overflow: Wait for next millisecond"Q: "How does this scale to millions of IDs per second?"
"Excellent scaling question!
Current Capacity:
- Single worker: 4,096 IDs/ms = 4M IDs/sec theoretical
- Practical: 10K IDs/sec per worker
- 100 workers: 1M IDs/sec
- 1,000 workers: 10M IDs/sec
Scaling Strategy:
1. Horizontal Scaling:
- Add more workers
- Linear scaling
- No coordination overhead
- Each worker independent
2. Geographic Distribution:
- Workers across datacenters
- Reduced latency
- Regional isolation
- Disaster recovery
3. Load Balancing:
- Round-robin distribution
- Geographic routing
- Health-based routing
- Auto-scaling
Bottlenecks:
- Worker ID space: 1,024 workers max
- Solution: Use 128-bit IDs for unlimited workers
- Sequence overflow: 4,096/ms per worker
- Solution: Add more workers
Cost:
- $100/worker/month
- 1M IDs/sec = 100 workers = $10K/month
- $0.01 per million IDs
- Very cost-effective"Common Mistakes to Avoid
❌ Don't Say:
"We'll use a database sequence"
→ Shows lack of understanding of distributed systems
"Clock synchronization isn't important"
→ Critical for time-based IDs
"We don't need to handle clock regression"
→ Will cause duplicate IDs
"Worker IDs can be random"
→ Defeats the purpose of uniqueness guarantee
"We can generate unlimited IDs per millisecond"
→ Ignores sequence bit limitations✅ Do Say:
"We'll use a Snowflake-style approach for coordination-free generation"
→ Shows knowledge of industry standards
"Clock synchronization via NTP is critical, with monitoring for drift"
→ Demonstrates understanding of dependencies
"We need to detect and handle clock regression to prevent duplicates"
→ Shows attention to edge cases
"Worker IDs must be unique and managed via coordination service"
→ Understands the uniqueness guarantee
"We're limited to 4,096 IDs per millisecond per worker, so we scale horizontally"
→ Understands capacity and scalingTime Management Tips
0-5 min: Requirements clarification
- Ask about scale, latency, ordering
- Clarify constraints (64-bit vs 128-bit)
- Understand use cases
5-15 min: High-level design
- Draw architecture diagram
- Explain ID format
- Discuss key components
- Cover basic algorithm
15-30 min: Deep dive
- Detailed algorithm
- Handle edge cases
- Discuss scaling
- Cover monitoring
30-40 min: Advanced topics
- Security considerations
- Alternative approaches
- Tradeoffs discussion
- Follow-up questions
40-45 min: Wrap up
- Summarize design
- Highlight key decisions
- Discuss next stepsKey Points to Emphasize
Technical Excellence
✓ No coordination required
✓ Sub-millisecond latency
✓ Linear horizontal scaling
✓ Time-ordered IDs
✓ 69-year lifetime
✓ 99.99% availabilityProduction Readiness
✓ Clock regression handling
✓ Sequence overflow management
✓ Worker ID coordination
✓ Comprehensive monitoring
✓ Graceful degradation
✓ Disaster recoveryScalability
✓ 10K IDs/sec per worker
✓ 1,024 workers globally
✓ 10M IDs/sec theoretical
✓ Geographic distribution
✓ Auto-scaling support
✓ Cost-effective ($0.01/M IDs)Practice Questions
Warm-up Questions
1. "Explain how Snowflake IDs work"
2. "What are the advantages over database sequences?"
3. "How do you ensure uniqueness?"
4. "What happens if the clock goes backwards?"Intermediate Questions
5. "How do you handle sequence overflow?"
6. "How do you assign worker IDs?"
7. "What's the maximum throughput?"
8. "How do you monitor the system?"Advanced Questions
9. "How do you migrate from auto-increment to Snowflake?"
10. "What if you need more than 1,024 workers?"
11. "How do you handle leap seconds?"
12. "What are the security considerations?"Final Checklist
Before ending the interview, ensure you've covered:
□ ID format and bit allocation
□ Uniqueness guarantee
□ Clock synchronization
□ Worker ID management
□ Sequence overflow handling
□ Scaling strategy
□ Monitoring and alerting
□ Edge cases (clock regression, etc.)
□ Alternative approaches
□ Tradeoffs and decisionsPost-Interview Reflection
Strong Performance Indicators:
✓ Clear explanation of ID format
✓ Handled edge cases proactively
✓ Discussed tradeoffs
✓ Showed production awareness
✓ Scaled the design appropriately
✓ Considered monitoring and operations
Areas for Improvement:
✗ Missed clock regression handling
✗ Didn't discuss worker ID management
✗ Ignored sequence overflow
✗ No monitoring discussion
✗ Didn't consider alternatives
✗ Weak on scaling strategyThis comprehensive interview guide prepares you to confidently discuss distributed unique ID generation systems, handle tough questions, and demonstrate both technical depth and production readiness.