Networking Protocols: Complete Guide for System Design
Overview
Every system design interview involves services communicating over a network. Understanding protocols at a deep level lets you make informed decisions about latency, reliability, and scalability.
1. TCP vs UDP
TCP (Transmission Control Protocol)
How it works: Connection-oriented, reliable, ordered delivery.
Three-Way Handshake
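Client                       Server
  │                           │
  │───── SYN (seq=x) ────────►│  1. Client initiates
  │                           │
  │◄──── SYN-ACK ─────────────│  2. Server acknowledges + initiates
  │      (seq=y, ack=x+1)     │
  │                           │
  │───── ACK (ack=y+1) ──────►│  3. Client confirms
  │                           │
  │      Connection Open      │
Cost: The handshake takes 1.5 RTT to complete, but the client can start sending after 1 RTT because the final ACK may carry the first bytes of the request. With TLS, add another 1-2 RTT (1 for TLS 1.3, 2 for TLS 1.2).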
Flow Control (Sliding Window)
- Receiver advertises a window size (how much data it can buffer)
- Sender never sends more than the window allows
- Window shrinks as data arrives, grows as application reads it
- Prevents fast sender from overwhelming slow receiver
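The window bookkeeping described above can be sketched as a toy model. This is a minimal illustration rather than how a real TCP stack is implemented: the `Receiver` class, `sendable_bytes`, and the byte counts are hypothetical names and values chosen for the example.

```python
class Receiver:
    """Toy receiver with a fixed-size buffer (all sizes in bytes)."""

    def __init__(self, buffer_size: int) -> None:
        self.buffer_size = buffer_size
        self.buffered = 0  # bytes received but not yet read by the application

    def advertised_window(self) -> int:
        # The window is whatever buffer space is still free.
        return self.buffer_size - self.buffered

    def receive(self, nbytes: int) -> None:
        if nbytes > self.advertised_window():
            raise ValueError("sender exceeded the advertised window")
        self.buffered += nbytes  # window shrinks as data arrives

    def app_read(self, nbytes: int) -> None:
        self.buffered -= min(nbytes, self.buffered)  # window grows as the app reads


def sendable_bytes(in_flight: int, advertised_window: int) -> int:
    """How much more the sender may transmit without overrunning the receiver."""
    return max(0, advertised_window - in_flight)
```

With a 4000-byte buffer, receiving 3000 bytes leaves a 1000-byte window; once the application reads 2000 bytes, the window grows back to 3000.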
Congestion Control
| Algorithm | Behavior |
|---|---|
| Slow Start | Exponential growth until threshold |
| Congestion Avoidance | Linear growth after threshold |
| Fast Retransmit | Retransmit after 3 duplicate ACKs |
| Fast Recovery | Don't reset to slow start on fast retransmit |
| BBR (Google) | Model-based, measures bandwidth and RTT |
Key insight for interviews: TCP slow start means new connections are slow. This is why connection pooling and HTTP/2 multiplexing matter.
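To make the cost of slow start concrete, the sketch below counts round trips needed to deliver a payload on a loss-free path. It is a simplified model under stated assumptions: an initial window of 10 segments (a common default), a hypothetical `ssthresh` of 64 segments, no packet loss, and a receiver window that never limits the sender.

```python
def rtts_to_deliver(total_segments: int, initial_cwnd: int = 10, ssthresh: int = 64) -> int:
    """Count round trips needed to deliver total_segments with no loss."""
    if total_segments <= 0:
        return 0
    cwnd = initial_cwnd
    sent = 0
    rtts = 0
    while sent < total_segments:
        sent += cwnd  # one window of segments per round trip
        rtts += 1
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double each RTT
        else:
            cwnd += 1  # congestion avoidance: one extra segment per RTT
    return rtts
```

Under these assumptions, `rtts_to_deliver(100)` is 4 round trips on a fresh connection, while a warm connection whose window has already grown (`initial_cwnd=64`) needs 2. That gap is the argument for reusing connections.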
Connection Teardown (Four-Way)
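Client                     Server
  │───── FIN ──────────────►│
  │◄──── ACK ───────────────│
  │◄──── FIN ───────────────│
  │───── ACK ──────────────►│
  │                         │
  │  TIME_WAIT (2×MSL)      │  The side that closes first waits (~60s) before the port can be reused
TIME_WAIT problem: The side that closes first holds the connection in TIME_WAIT, so hosts that open many short-lived outbound connections (proxies, API gateways, busy clients) can exhaust ephemeral ports. Solutions: connection pooling or long-lived connections; on Linux, net.ipv4.tcp_tw_reuse allows reuse for outbound connections. SO_REUSEADDR mainly helps a restarted server rebind its listening port and does not by itself fix ephemeral-port exhaustion.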
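A rough budget shows why this matters. The sketch below assumes the default Linux ephemeral port range (32768-60999) and a 60-second TIME_WAIT; both are tunable, and the function name is hypothetical. The limit applies per (source IP, destination IP, destination port) combination.

```python
def max_new_connections_per_second(port_low: int, port_high: int, time_wait_s: float) -> float:
    """Sustained rate of short-lived outbound connections before ports run out."""
    usable_ports = port_high - port_low + 1
    # Each closed connection keeps its port unavailable for time_wait_s seconds.
    return usable_ports / time_wait_s


rate = max_new_connections_per_second(32768, 60999, 60.0)
print(f"{rate:.0f} new connections/s to a single destination")
```

With these assumed defaults the budget is roughly 470 new connections per second to one destination, which a busy proxy can exceed easily without pooling.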
UDP (User Datagram Protocol)
How it works: Connectionless, unreliable, unordered. Just sends packets.
- No handshake (0 RTT to start sending)
- No guaranteed delivery or ordering
- No flow/congestion control (application must handle)
- Smaller header (8 bytes vs TCP's 20+ bytes)
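The absence of a handshake is visible in code: the first packet already carries data. The sketch below sends one datagram to an echo socket on the loopback interface; `udp_round_trip` is a hypothetical helper, and the timeouts stand in for the loss handling that a real application would have to implement itself.

```python
import socket


def udp_round_trip(payload: bytes, timeout_s: float = 1.0) -> bytes:
    """Send one datagram to a local echo socket and return what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(timeout_s)
        client.settimeout(timeout_s)
        # No connect/accept step: the first packet on the wire carries the data.
        client.sendto(payload, server.getsockname())
        data, client_addr = server.recvfrom(2048)
        server.sendto(data, client_addr)  # echo the datagram back
        reply, _ = client.recvfrom(2048)
        return reply
```

If a datagram were lost, `recvfrom` would raise a timeout rather than retry, which is exactly the reliability gap the application has to fill.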
When to Use Which
| Use Case | Protocol | Why |
|---|---|---|
| Web APIs | TCP | Need reliability, ordering |
| Live video / conferencing | UDP | Tolerate loss, need low latency (on-demand streaming usually runs over HTTP on TCP) |
| Gaming | UDP | Real-time, stale data useless |
| DNS queries | UDP | Small payload, speed matters (falls back to TCP for large responses) |
| File transfer | TCP | Must have complete, ordered data |
| VoIP | UDP | Real-time, retransmission too slow |
| IoT telemetry | UDP | Lightweight, high volume (e.g. CoAP; MQTT itself runs over TCP) |
Interview tip: "We'd use TCP here because we need guaranteed delivery of financial transactions" or "UDP for live video because a retransmitted frame arrives too late to be useful."
2. HTTP/1.1 vs HTTP/2 vs HTTP/3
HTTP/1.1 (1997)
Key characteristics:
- Text-based protocol
- One request per TCP connection at a time (head-of-line blocking)
- Workaround: browsers open 6-8 parallel connections per domain
- Keep-Alive reuses connections but still sequential
Head-of-Line (HOL) Blocking:
Connection 1: [Request A]────────[Response A]────[Request C]────[Response C]
Connection 2: [Request B]──[Response B]──────────[idle]─────────────────────
If Response A is slow, Request C waits even though the server could serve it.
HTTP/2 (2015)
Key improvements:
- Binary framing layer (more efficient parsing)
- Multiplexing: Multiple streams over single TCP connection
- Header compression (HPACK): Reduces redundant headers by 85-90%
- Server push: Server sends resources before client requests them (rarely used in practice; Chrome has since disabled it by default)
- Stream prioritization: Client hints which resources matter most
Single TCP Connection:
┌─────────────────────────────────────────────┐
│ Stream 1: [Headers][Data][Data]             │
│ Stream 2: [Headers][Data]                   │
│ Stream 3: [Headers][Data][Data][Data]       │
│ (interleaved frames on the wire)            │
└─────────────────────────────────────────────┘
Remaining problem: TCP-level HOL blocking. If a TCP packet is lost, ALL streams wait for retransmission, even unaffected ones.
Performance numbers (commonly cited; actual gains depend heavily on the workload):
- 50-70% reduction in page load time for asset-heavy pages
- Single connection vs 6-8 connections reduces server memory
- Header compression saves 10-30KB per page load
HTTP/3 (2022) - QUIC
Key innovation: Runs over UDP instead of TCP, implements its own reliability.
┌──────────────────────────────────────┐
│  HTTP/3 (Application)                │
├──────────────────────────────────────┤
│  QUIC (Transport)                    │
│  • Stream multiplexing               │
│  • Per-stream flow control           │
│  • Connection migration              │
│  • 0-RTT connection establishment    │
├──────────────────────────────────────┤
│  UDP (datagrams over IP)             │
└──────────────────────────────────────┘
Advantages over HTTP/2:
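- No transport-level HOL blocking: A lost packet only stalls the streams whose data it carried
- 0-RTT resumption: Returning clients send data immediately (0-RTT data can be replayed, so limit it to idempotent requests)
- Connection migration: Survives IP changes (WiFi → cellular)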
- Built-in encryption: TLS 1.3 integrated into handshake
Connection establishment comparison:
HTTP/1.1 + TLS 1.2:   3 RTT before the first request (TCP 1 + TLS 2)
HTTP/2 + TLS 1.3:     2 RTT before the first request (TCP 1 + TLS 1)
HTTP/3 (new):         1 RTT (QUIC handshake includes crypto)
HTTP/3 (resumption):  0 RTT (send data with first packet)
Comparison Table
| Feature | HTTP/1.1 | HTTP/2 | HTTP/3 |
|---|---|---|---|
| Transport | TCP | TCP | QUIC (UDP) |
| Multiplexing | No | Yes | Yes |
| HOL Blocking | Application + TCP | TCP only | None |
| Header Compression | None | HPACK | QPACK |
| Connection Setup (with TLS) | 2-3 RTT | 2-3 RTT | 0-1 RTT |
| Connection Migration | No | No | Yes |
| Encryption | Optional | Effectively required | Mandatory |
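The connection-setup figures translate directly into time to first response. The sketch below is a back-of-the-envelope model: the `SETUP_RTTS` table and function name are hypothetical, and it assumes one extra round trip for the request itself while ignoring DNS lookup and server processing time.

```python
# Round trips spent on connection setup before the first request can be sent.
SETUP_RTTS = {
    "HTTP/1.1 + TLS 1.2": 3,   # TCP (1) + TLS 1.2 (2)
    "HTTP/2 + TLS 1.3": 2,     # TCP (1) + TLS 1.3 (1)
    "HTTP/3 (new)": 1,         # QUIC handshake with integrated crypto
    "HTTP/3 (resumption)": 0,  # request rides in the first flight
}


def time_to_first_response_ms(stack: str, rtt_ms: float) -> float:
    """Setup round trips plus one round trip for the request and response."""
    return (SETUP_RTTS[stack] + 1) * rtt_ms


for stack in SETUP_RTTS:
    print(f"{stack:22s} {time_to_first_response_ms(stack, 100):.0f} ms at 100 ms RTT")
```

On a 100 ms mobile link this model gives 400 ms for HTTP/1.1 with TLS 1.2 versus 100 ms for a resumed HTTP/3 connection, which is why setup cost dominates on high-latency paths.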
3. WebSocket vs SSE vs Long Polling
Long Polling
Client                        Server
  │── GET /updates ───────────►│
  │                            │ (holds connection open)
  │                            │ ... waits for data ...
  │◄── 200 OK + data ──────────│ (responds when data available)
  │                            │
  │── GET /updates ───────────►│ (immediately reconnects)
  │                            │ ... waits again ...
Characteristics:
- Compatible with all infrastructure (proxies, load balancers)
- Each response requires a new HTTP request
- Timeout handling needed (30-60s typical)
- Server holds connections open (resource intensive)
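- Extra latency per message, since the client must re-issue the request after every response (about one round trip, often tens to hundreds of milliseconds)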
Server-Sent Events (SSE)
Client                        Server
  │── GET /stream ────────────►│
  │   Accept: text/event-stream│
  │                            │
  │◄── HTTP 200 ───────────────│
  │   Content-Type:            │
  │   text/event-stream        │
  │                            │
  │◄── data: message 1\n\n ────│ (server pushes)
  │◄── data: message 2\n\n ────│ (server pushes)
  │◄── data: message 3\n\n ────│ (server pushes)
  │   ...                      │
Characteristics:
- Unidirectional (server → client only)
- Built-in reconnection with Last-Event-ID
- Text-based (UTF-8 only)
- Works over standard HTTP (proxy-friendly)
- Automatic reconnection by browser
- Limited to ~6 connections per domain in HTTP/1.1
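The wire format is simple enough to sketch by hand. The example below formats and parses events with `id:` and `data:` fields only; it is a simplified illustration with hypothetical function names, and it ignores `event:`, `retry:`, and CRLF line endings that a complete implementation would handle.

```python
from typing import Dict, Iterator, Optional


def format_sse_event(data: str, event_id: Optional[str] = None) -> str:
    """Serialize one event; a blank line terminates it."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    for line in data.split("\n"):
        lines.append(f"data: {line}")  # multi-line data becomes repeated data: fields
    return "\n".join(lines) + "\n\n"


def parse_sse_stream(text: str) -> Iterator[Dict[str, Optional[str]]]:
    """Yield events from a text/event-stream body (simplified parser)."""
    for block in text.split("\n\n"):
        if not block.strip():
            continue
        event_id = None
        data_lines = []
        for line in block.split("\n"):
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if field == "data":
                data_lines.append(value)
            elif field == "id":
                event_id = value
        yield {"id": event_id, "data": "\n".join(data_lines)}
```

A client that remembers the last `id` it parsed can send it back in the `Last-Event-ID` header on reconnect so the server can resume from that point.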
WebSocket
Client                        Server
  │── HTTP Upgrade Request ───►│
  │   Upgrade: websocket       │
  │   Connection: Upgrade      │
  │                            │
  │◄── 101 Switching ──────────│
  │    Protocols               │
  │                            │
  │◄──► Full-duplex binary ───►│ (bidirectional frames)
  │◄──► communication ────────►│
Characteristics:
- Full-duplex (both directions simultaneously)
- Binary and text frames
- Low overhead per message (2-14 bytes framing vs HTTP headers)
- Persistent connection
- Requires WebSocket-aware load balancers
- No built-in reconnection (application must handle)
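The upgrade handshake includes a small proof that the server understood the WebSocket request: it hashes the client's `Sec-WebSocket-Key` with a fixed GUID defined by RFC 6455 and returns the result in `Sec-WebSocket-Accept`. A minimal sketch of that computation (the function name is hypothetical):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This is not a security mechanism; it only guards against caches and non-WebSocket servers accidentally accepting the upgrade.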
Decision Framework
| Requirement | Best Choice | Why |
|---|---|---|
| Real-time chat | WebSocket | Bidirectional, low latency |
| Live sports scores | SSE | Server-push only, auto-reconnect |
| Stock ticker | WebSocket | High frequency, bidirectional |
| News feed updates | SSE | Infrequent server pushes |
| Collaborative editing | WebSocket | Bidirectional, binary data |
| Simple notifications | SSE | One-way, simple implementation |
| Legacy system support | Long Polling | Works everywhere |
| IoT device commands | WebSocket | Bidirectional, persistent |
Interview tip: "For a notification system, SSE is simpler and sufficient since we only push from server to client. WebSocket adds complexity we don't need."
4. DNS Resolution Flow
Complete Resolution Process
User types "www.example.com"
         │
         ▼
┌─────────────────┐  Cache hit?
│  Browser Cache  │─────────────────► Done (TTL-based)
└────────┬────────┘
         │ Cache miss
         ▼
┌─────────────────┐  Cache hit?
│    OS Cache     │─────────────────► Done
│ (stub resolver) │
└────────┬────────┘
         │ Cache miss
         ▼
┌─────────────────┐  Cache hit?
│  Recursive DNS  │─────────────────► Done
│  (ISP/8.8.8.8)  │
└────────┬────────┘
         │ Cache miss (iterative queries begin)
         ▼
┌─────────────────┐
│   Root Server   │───► "Ask .com TLD server at 192.5.6.30"
│ (13 named, A-M) │     (each root name is served by many anycast instances)
└────────┬────────┘
         ▼
┌─────────────────┐
│   TLD Server    │───► "Ask example.com NS at 205.251.192.1"
│  (.com, .org)   │
└────────┬────────┘
         ▼
┌──────────────────────┐
│ Authoritative Server │───► "www.example.com = 93.184.216.34"
│    (example.com)     │
└──────────────────────┘
Record Types
| Type | Purpose | Example |
|---|---|---|
| A | IPv4 address | example.com → 93.184.216.34 |
| AAAA | IPv6 address | example.com → 2606:2800:220:1:... |
| CNAME | Alias to another name | www → example.com |
| MX | Mail server | example.com → mail.example.com |
| NS | Nameserver delegation | example.com → ns1.example.com |
| TXT | Arbitrary text | SPF, DKIM, domain verification |
| SRV | Service location | _http._tcp.example.com |
TTL and Caching Strategy
- Short TTL (60-300s): Enables fast failover, more DNS traffic
- Long TTL (3600-86400s): Reduces DNS load, slower failover
- TTL=0: Asks resolvers not to cache (sometimes used during migrations, but not every resolver honors it)
System design implications:
- DNS-based load balancing uses short TTLs for health-check responsiveness
- CDN providers often use short TTLs (around 60s) for quick origin switching
- During migrations, lower TTL days in advance, then switch
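The failover trade-off follows from how a resolver cache treats TTLs: a record keeps being served until it expires, no matter what the authoritative server says in the meantime. A minimal sketch, with a hypothetical `TtlCache` class and an injectable clock so the behavior can be stepped through deterministically:

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: answers are served until their TTL expires."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, address: str, ttl_s: float) -> None:
        self._entries[name] = (address, self._clock() + ttl_s)

    def get(self, name: str) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: caller must re-resolve upstream
            return None
        return address
```

With a 60-second TTL, clients may keep hitting a failed endpoint for up to a minute after DNS is updated; with an 86400-second TTL, for up to a day. That is why TTLs are lowered well before a planned switch.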
DNS in System Design
- Global load balancing: GeoDNS routes users to nearest datacenter
- Service discovery: Internal DNS for microservice endpoints
- Failover: Health-checked DNS removes unhealthy endpoints
- Blue-green deployments: Switch DNS to new environment
5. TLS Handshake
TLS 1.2 Handshake (2 RTT)
Client                              Server
  │                                   │
  │── ClientHello ───────────────────►│  Supported ciphers, random
  │                                   │
  │◄── ServerHello ───────────────────│  Chosen cipher, random
  │◄── Certificate ───────────────────│  Server's X.509 cert
  │◄── ServerKeyExchange ─────────────│  DH parameters
  │◄── ServerHelloDone ───────────────│
  │                                   │
  │── ClientKeyExchange ─────────────►│  Client's DH public key
  │── ChangeCipherSpec ──────────────►│  "Switching to encrypted"
  │── Finished ──────────────────────►│  Encrypted verification
  │                                   │
  │◄── ChangeCipherSpec ──────────────│
  │◄── Finished ──────────────────────│
  │                                   │
  │◄──► Encrypted Application Data ──►│
TLS 1.3 Handshake (1 RTT)
Client                              Server
  │                                   │
  │── ClientHello ───────────────────►│  + supported_versions
  │                                   │  + signature_algorithms
  │                                   │  + key_share (DH public key)
  │                                   │
  │◄── ServerHello ───────────────────│  + key_share
  │◄── EncryptedExtensions ───────────│  (encrypted from here)
  │◄── Certificate ───────────────────│
  │◄── CertificateVerify ─────────────│
  │◄── Finished ──────────────────────│
  │                                   │
  │── Finished ──────────────────────►│
  │                                   │
  │◄──► Encrypted Application Data ──►│
Key improvements in TLS 1.3:
- 1 RTT handshake (vs 2 RTT in 1.2)
- 0-RTT resumption (send data with first message)
- Removed insecure algorithms (RSA key exchange, CBC, RC4, SHA-1)
- Forward secrecy mandatory (ephemeral DH only)
- Encrypted more of the handshake (hides certificate from observers)
Certificate Chain Verification
┌─────────────────────┐
│  Root CA (trusted)  │  Pre-installed in OS/browser
│     Self-signed     │  ~150 root CAs trusted globally
└──────────┬──────────┘
           │ signs
           ▼
┌─────────────────────┐
│   Intermediate CA   │  Issued by Root CA
│                     │  Used for day-to-day signing
└──────────┬──────────┘
           │ signs
           ▼
┌─────────────────────┐
│ Server Certificate  │  Your domain's certificate
│     (leaf cert)     │  Contains public key + domain name
└─────────────────────┘
ALPN (Application-Layer Protocol Negotiation)
- Negotiates application protocol during TLS handshake
- Client sends list: ["h2", "http/1.1"]
- Server picks: "h2"
- Avoids extra round trip for protocol upgrade
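- Essential for HTTP/2 negotiation; HTTP/3 uses the "h3" ALPN token inside the QUIC handshake and is usually discovered first via the Alt-Svc header or HTTPS DNS records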
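On the client side, offering ALPN protocols is a one-line setting on the TLS context. The sketch below uses Python's standard `ssl` module; the function name and the choice of TLS 1.2 as the minimum version are assumptions for illustration.

```python
import ssl


def build_client_context() -> ssl.SSLContext:
    """TLS client context that verifies certificates and offers h2 via ALPN."""
    ctx = ssl.create_default_context()  # enables chain and hostname verification
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy for this example
    ctx.set_alpn_protocols(["h2", "http/1.1"])  # in order of preference
    return ctx
```

After the handshake completes on a socket wrapped with this context, `selected_alpn_protocol()` on the wrapped socket returns the protocol the server picked, or `None` if the server did not negotiate ALPN.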
6. gRPC and Protocol Buffers
Protocol Buffers (Protobuf)
Schema definition:
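syntax = "proto3";

import "google/protobuf/timestamp.proto";  // required for Timestamp

message User {
  string id = 1;  // field number, not value
  string name = 2;
  string email = 3;
  repeated string roles = 4;
  google.protobuf.Timestamp created_at = 5;
}

message GetUserRequest {
  string user_id = 1;
}

// Hypothetical request message for ListUsers; the field is illustrative.
message ListUsersRequest {
  int32 page_size = 1;
}

service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (stream User);  // server streaming
  rpc CreateUser(User) returns (User);
}
Binary encoding advantages: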
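- Typically 3-10x smaller than JSON
- Substantially faster serialization/deserialization (figures of 20-100x are sometimes quoted, but results vary widely by language and payload)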
- Schema evolution with backward/forward compatibility
- Strongly typed (catches errors at compile time)
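The size advantage comes from the wire format: field names are replaced by small numeric tags and lengths are varint-encoded. The sketch below hand-encodes two string fields of the `User` message to show this. It is an illustration of the encoding only, with hypothetical helper names; real code would use the generated protobuf classes.

```python
import json


def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_user(user_id: str, name: str) -> bytes:
    """Encode fields 1 (id) and 2 (name) of the User message."""
    return encode_string_field(1, user_id) + encode_string_field(2, name)


proto_bytes = encode_user("42", "Ada")
json_bytes = json.dumps({"id": "42", "name": "Ada"}, separators=(",", ":")).encode("utf-8")
print(len(proto_bytes), len(json_bytes))  # 9 24
```

For this tiny message the binary form is 9 bytes against 24 for compact JSON; the ratio depends on how much of the payload is field names versus values.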
gRPC Communication Patterns
┌──────────────────────────────────────────────────────┐
│                      gRPC Modes                      │
├──────────────────────────────────────────────────────┤
│                                                      │
│ 1. Unary RPC (request-response)                      │
│    Client ──[Request]──► Server ──[Response]──►      │
│                                                      │
│ 2. Server Streaming                                  │
│    Client ──[Request]──► Server ──[R1][R2][R3]──►    │
│                                                      │
│ 3. Client Streaming                                  │
│    Client ──[R1][R2][R3]──► Server ──[Response]──►   │
│                                                      │
│ 4. Bidirectional Streaming                           │
│    Client ◄──[messages]──► Server                    │
│                                                      │
└──────────────────────────────────────────────────────┘
gRPC vs REST Comparison
| Aspect | gRPC | REST |
|---|---|---|
| Protocol | HTTP/2 | HTTP/1.1 or HTTP/2 |
| Payload | Protobuf (binary) | JSON (text) |
| Contract | .proto file (strict) | OpenAPI (optional) |
| Streaming | Native (4 modes) | Limited (SSE, WebSocket) |
| Browser support | Via grpc-web proxy | Native |
| Code generation | Built-in | Third-party tools |
| Latency | Lower (binary + HTTP/2) | Higher (text + overhead) |
| Debugging | Harder (binary) | Easier (human-readable) |
| Load balancing | L7 or client-side (an L4 balancer pins every request on a long-lived HTTP/2 connection to one backend) | L4 or L7 |
When to Use gRPC
Use gRPC for:
- Internal microservice communication (performance critical)
- Polyglot environments (code gen for 10+ languages)
- Streaming data (real-time feeds, event streams)
- Mobile clients with bandwidth constraints
Use REST for:
- Public APIs (browser compatibility, developer familiarity)
- Simple CRUD operations
- When human readability matters for debugging
- Third-party integrations
gRPC Performance Characteristics
Benchmark: 1000 requests, 1KB payload (indicative figures; results vary by language, library, and hardware)
REST/JSON:
  Serialization: ~500μs
  Payload size: ~1,200 bytes
  Total latency: ~2ms
gRPC/Protobuf:
  Serialization: ~50μs
  Payload size: ~400 bytes
  Total latency: ~0.5ms
Interview tip: "For service-to-service communication within our backend, gRPC gives us type safety, streaming, and 3-5x better performance. For our public API, REST is more accessible to third-party developers."
Quick Reference: Protocol Selection
Need reliable delivery?
├── Yes → TCP-based
│   ├── Request-response? → HTTP (REST or gRPC)
│   ├── Server push only? → SSE
│   ├── Bidirectional real-time? → WebSocket
│   └── High-performance internal? → gRPC
└── No → UDP-based
    ├── Real-time media? → RTP/WebRTC
    ├── Fast queries? → DNS
    └── IoT/lightweight? → CoAP, MQTT-SN
Note: QUIC (HTTP/3) also runs over UDP but provides its own reliable delivery, so it belongs with the reliable options. Standard MQTT runs over TCP.
Interview Cheat Sheet
| When interviewer asks... | Key points to mention |
|---|---|
| "How do services communicate?" | gRPC for internal, REST for external, async via message queues |
| "How to handle real-time updates?" | WebSocket for bidirectional, SSE for server-push, consider scale |
| "Why is the first request slow?" | TCP handshake + TLS handshake + DNS resolution = cold start |
| "How to reduce latency?" | Connection pooling, HTTP/2 multiplexing, 0-RTT with TLS 1.3 |
| "How does HTTPS work?" | TLS handshake, certificate verification, (EC)DHE key exchange to derive symmetric session keys |
| "HTTP/2 vs HTTP/3?" | HTTP/3 eliminates TCP HOL blocking, enables connection migration |