🔄 Consistent Hashing
Distributed key assignment with minimal remapping
Overview
Consistent hashing maps keys to nodes on a virtual ring so that adding or removing a node only remaps ~K/N keys (K = total keys, N = nodes), unlike modular hashing which remaps almost everything. This makes it essential for distributed caches (Memcached), load balancers, and distributed databases (DynamoDB, Cassandra).
Key Concepts
- Hash Ring: Nodes and keys are hashed onto a circular space [0, MAX_INT). A key is assigned to the first node found clockwise from its position.
- Virtual Nodes (vnodes): Each physical node gets multiple positions on the ring for better load balance. 150 vnodes/node gives ~5% std deviation.
- Minimal Disruption: Adding a node remaps only ~1/N of keys. Removing a node remaps only the keys it owned.
API
| Function | Description |
|---|---|
create ~vnodes:150 () | Empty ring with given vnode count |
add_node t "server-A" | Add a physical node |
remove_node t "server-A" | Remove a node |
lookup t "my-key" | Find owning node for a key |
distribution t keys | Key count per node |
balance_score t keys | Std deviation of distribution |
remapping_impact t keys node | Simulate adding a node |
Example Output
Consistent Hash Ring: 3 nodes, 150 vnodes/node, 450 ring points
Nodes: server-A, server-B, server-C
Key assignments:
user:1001 -> server-B
session:abc -> server-A
cache:homepage -> server-C
Distribution of 10,000 keys across 3 nodes:
server-A : 3412 keys (34.1%)
server-B : 3287 keys (32.9%)
server-C : 3301 keys (33.0%)
Std deviation: 53.3
Adding node 'server-D': 2489/10000 keys remapped (24.9%)
Complexity
| Operation | Time |
|---|---|
| lookup | O(V·N) where V=vnodes, N=nodes |
| add_node | O(V·N) — insert V points into sorted list |
| remove_node | O(V·N) — filter ring |
For production use, replace the sorted list with a balanced BST for O(log(V·N)) lookup.
Real-World Uses
- Amazon DynamoDB — partition key routing
- Apache Cassandra — token ring partitioning
- Memcached — client-side consistent hashing
- Akamai CDN — originally invented for this (Karger et al., 1997)