Consistent hashing

Software & architecture · architecture · Sep 2023

When you shard data across $N$ nodes, the naive mapping $\text{node} = \text{hash}(key) \bmod N$ has a fatal flaw: change $N$ and almost every key moves. Consistent hashing fixes that, so adding or removing a node relocates only about $1/N$ of the keys, which is the difference between a smooth scale-up and a cache-wide stampede.

The reason $\bmod N$ fails is arithmetic. A key keeps its node when $N$ grows from $k$ to $k+1$ only if $\text{hash}(key) \bmod k = \text{hash}(key) \bmod (k+1)$, which happens with probability roughly $1/(k+1)$. So nearly all keys are remapped, and a sharded cache effectively empties on every resize.

The ring and the 1/N result

Consistent hashing places both nodes and keys on a circular hash space, and a key is owned by the first node clockwise from its position. Adding a node only captures the arc between it and its predecessor, so the expected fraction of keys that move is about $1/(N+1)$; removing a node hands its arc to the next node, moving about $1/N$. Lookups are a binary search over the sorted node positions, $O(\log N)$.

Virtual nodes and load balance

With only a few real nodes, the random arcs are uneven and load is lopsided. Giving each physical node $V$ virtual positions scattered around the ring averages this out: the standard deviation of per-node load shrinks on the order of $O(1/\sqrt{V})$, so a hundred or more virtual nodes per server gives near-uniform load. The cost is memory and lookup time proportional to $V \times N$, the usual smoothness-versus-overhead trade.

Variants

Jump consistent hash (Lamping & Veach 2014) computes the bucket for a key in a few lines with no stored ring at all and near-perfect balance, at the cost of requiring sequentially numbered buckets, which suits sharded storage more than a churny set of cache servers. Bounded-load consistent hashing caps any node's share to avoid hotspots when keys are skewed.

Step by step

  1. Hash each node, plus several virtual copies, onto a ring of integers.
  2. Hash each key onto the same ring.
  3. A key is served by the first node clockwise (a binary search over sorted hashes).
  4. Adding a node steals only the arc before it; removing one hands its arc to the next.
import hashlib, bisect

class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self.ring = {}          # ring position -> node
        self.keys = []          # sorted ring positions
        for n in nodes:
            self.add(n)

    def _hash(self, s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            self.ring[h] = node
            bisect.insort(self.keys, h)

    def get(self, key):
        if not self.ring:
            return None
        h = self._hash(key)
        i = bisect.bisect(self.keys, h) % len(self.keys)   # next node clockwise
        return self.ring[self.keys[i]]

Complexity (time and space)

Lookup is $O(\log V')$ where $V' = V \times N$ is the number of ring positions; adding a node is $O(V \log V')$. Space is $O(V')$. More virtual nodes lowers load imbalance like $O(1/\sqrt{V})$ but raises memory and lookup cost.

Worked example

Hashing keys onto a three-node ring, then dropping a node, moves only about a third of the keys:

ring = HashRing(["A", "B", "C"], vnodes=100)
print(ring.get("apple"))    # 'A'
print(ring.get("banana"))   # 'B'
# Of 1000 keys, removing node C moves about 336 (~1/N); the rest stay put.

Follow-up questions

  • Why does hash mod N remap almost everything? A key keeps its bucket only if its hash has the same remainder mod N and mod N+1, probability ~1/(N+1), so nearly all keys move.
  • What fraction of keys move when a node leaves? About 1/N: only the keys owned by that node's arcs are reassigned to the next node clockwise.
  • Why virtual nodes? Few real nodes give uneven arcs and skewed load; many virtual positions per node reduce load imbalance like O(1/sqrt(V)).
  • When use jump consistent hash instead? When buckets can be numbered sequentially and you want zero storage and better balance, typical of sharded storage.
  • How do you prevent a single node overloading on skewed keys? Bounded-load consistent hashing caps each node's share, redirecting overflow to the next node.

References

  1. Karger et al., Consistent Hashing and Random Trees (1997).
  2. Lamping & Veach, A Fast, Minimal Memory, Consistent Hash Algorithm (jump hash, 2014).
  3. Kleppmann, Designing Data-Intensive Applications (2017).