When you shard data across $N$ nodes, the naive mapping $\text{node} = \text{hash}(key) \bmod N$ has a fatal flaw: change $N$ and almost every key moves. Consistent hashing fixes that, so adding or removing a node relocates only about $1/N$ of the keys, which is the difference between a smooth scale-up and a cache-wide stampede.
The reason $\bmod N$ fails is arithmetic. A key keeps its node when $N$ grows from $k$ to $k+1$ only if $\text{hash}(key) \bmod k = \text{hash}(key) \bmod (k+1)$, which happens with probability roughly $1/(k+1)$. So nearly all keys are remapped, and a sharded cache effectively empties on every resize.
The ring and the 1/N result
Consistent hashing places both nodes and keys on a circular hash space, and a key is owned by the first node clockwise from its position. Adding a node only captures the arc between it and its predecessor, so the expected fraction of keys that move is about $1/(N+1)$; removing a node hands its arc to the next node, moving about $1/N$. Lookups are a binary search over the sorted node positions, $O(\log N)$.
Virtual nodes and load balance
With only a few real nodes, the random arcs are uneven and load is lopsided. Giving each physical node $V$ virtual positions scattered around the ring averages this out: the standard deviation of per-node load shrinks on the order of $O(1/\sqrt{V})$, so a hundred or more virtual nodes per server gives near-uniform load. The cost is memory and lookup time proportional to $V \times N$, the usual smoothness-versus-overhead trade.
Variants
Jump consistent hash (Lamping & Veach 2014) computes the bucket for a key in a few lines with no stored ring at all and near-perfect balance, at the cost of requiring sequentially numbered buckets, which suits sharded storage more than a churny set of cache servers. Bounded-load consistent hashing caps any node's share to avoid hotspots when keys are skewed.
Step by step
- Hash each node, plus several virtual copies, onto a ring of integers.
- Hash each key onto the same ring.
- A key is served by the first node clockwise (a binary search over sorted hashes).
- Adding a node steals only the arc before it; removing one hands its arc to the next.
import hashlib, bisect
class HashRing:
def __init__(self, nodes=(), vnodes=100):
self.vnodes = vnodes
self.ring = {} # ring position -> node
self.keys = [] # sorted ring positions
for n in nodes:
self.add(n)
def _hash(self, s):
return int(hashlib.md5(s.encode()).hexdigest(), 16)
def add(self, node):
for i in range(self.vnodes):
h = self._hash(f"{node}#{i}")
self.ring[h] = node
bisect.insort(self.keys, h)
def get(self, key):
if not self.ring:
return None
h = self._hash(key)
i = bisect.bisect(self.keys, h) % len(self.keys) # next node clockwise
return self.ring[self.keys[i]]
Complexity (time and space)
Lookup is $O(\log V')$ where $V' = V \times N$ is the number of ring positions; adding a node is $O(V \log V')$. Space is $O(V')$. More virtual nodes lowers load imbalance like $O(1/\sqrt{V})$ but raises memory and lookup cost.
Worked example
Hashing keys onto a three-node ring, then dropping a node, moves only about a third of the keys:
ring = HashRing(["A", "B", "C"], vnodes=100)
print(ring.get("apple")) # 'A'
print(ring.get("banana")) # 'B'
# Of 1000 keys, removing node C moves about 336 (~1/N); the rest stay put.
Follow-up questions
- Why does hash mod N remap almost everything? A key keeps its bucket only if its hash has the same remainder mod N and mod N+1, probability ~1/(N+1), so nearly all keys move.
- What fraction of keys move when a node leaves? About 1/N: only the keys owned by that node's arcs are reassigned to the next node clockwise.
- Why virtual nodes? Few real nodes give uneven arcs and skewed load; many virtual positions per node reduce load imbalance like O(1/sqrt(V)).
- When use jump consistent hash instead? When buckets can be numbered sequentially and you want zero storage and better balance, typical of sharded storage.
- How do you prevent a single node overloading on skewed keys? Bounded-load consistent hashing caps each node's share, redirecting overflow to the next node.
References
- Karger et al., Consistent Hashing and Random Trees (1997).
- Lamping & Veach, A Fast, Minimal Memory, Consistent Hash Algorithm (jump hash, 2014).
- Kleppmann, Designing Data-Intensive Applications (2017).