Sorting: merge sort and quicksort

Algorithms · foundations · Apr 2024

Merge sort and quicksort are the two comparison sorts worth knowing in detail. Both are divide and conquer, but they divide the work differently and make opposite trade-offs, which is exactly why interviews and real systems keep both around.

Merge sort

Merge sort splits the array in half, sorts each half recursively, and merges the two sorted halves with a linear two-pointer pass. Its recurrence is $T(n) = 2T(n/2) + O(n)$, which the master theorem solves to $\Theta(n\log n)$ in all cases, worst included. It is stable (equal elements keep their order) and needs $O(n)$ scratch space, and because it accesses data sequentially it is the algorithm of choice for linked lists and external sorts that stream from disk.

Quicksort

Quicksort picks a pivot, partitions the array so smaller elements precede it and larger ones follow, then recurses on each side. It sorts in place and, with good pivots, runs in $\Theta(n\log n)$ on average with small constants and excellent cache behavior, which makes it the fastest general sort in practice. Its weakness is a worst case of $\Theta(n^2)$, hit when pivots are consistently extreme (for example always choosing the last element on already-sorted input); randomizing the pivot or using median-of-three makes that worst case astronomically unlikely.

The n log n lower bound

No comparison sort can beat $\Omega(n\log n)$, and the argument is clean. Any comparison sort is a decision tree whose leaves are the possible orderings; there are $n!$ permutations, and a binary tree of height $h$ has at most $2^h$ leaves, so $2^h \ge n!$. Taking logs and using Stirling's approximation, $h \ge \log_2 n! = \Theta(n\log n)$. Non-comparison sorts like counting and radix sort beat this only because they do not compare; they exploit structure in the keys.

Stability and what runs in practice

A stable sort preserves the relative order of equal keys, which matters when sorting by one field after another (sort by name, then stably by department). Most standard libraries use an adaptive hybrid: Python's Timsort merges runs and switches to insertion sort on small or nearly-sorted segments, getting $O(n)$ on already-ordered data while keeping the $O(n\log n)$ guarantee.

Step by step

  1. Merge sort: split in half, recursively sort each half, merge the two sorted halves.
  2. Merge: walk both halves with two pointers, always taking the smaller front element.
  3. Quicksort: choose a pivot and partition so smaller elements come before it.
  4. Recurse on the left and right partitions around the pivot's final position.
def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:           # <= keeps the sort stable
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return out

def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

def partition(a, lo, hi):
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]; i += 1
    a[i], a[hi] = a[hi], a[i]
    return i

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    p = partition(a, lo, hi)
    quicksort(a, lo, p - 1)
    quicksort(a, p + 1, hi)
    return a

Pivot selection and the Hoare scheme

Quicksort's $O(n^2)$ worst case comes entirely from bad pivots, so production quicksorts never use a fixed pivot like the last element. The cheap, standard fix is a random pivot, which makes the worst case astronomically unlikely (expected $O(n\log n)$ on any input); a common refinement is median-of-three, the median of the first, middle, and last elements, which sidesteps the already-sorted trap. For a worst-case guarantee there is median of medians: split the array into groups of five, take each group's median, then recursively take the median of those medians as the pivot. That pivot provably falls in the middle 30 to 70 percent of the values, giving worst-case $O(n\log n)$ sorting and worst-case $O(n)$ selection via quickselect.

The partition scheme matters too. The Lomuto scheme above is simple but does many swaps; the original Hoare partition walks two pointers inward and swaps far less, which is why real implementations prefer it. It returns a split point rather than the pivot's final index, so the recursion covers [lo, p] and [p+1, hi]:

import random

def partition_hoare(a, lo, hi):
    pivot = a[random.randint(lo, hi)]        # random pivot avoids the O(n^2) worst case
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while a[i] < pivot:                  # scan up to an element not below the pivot
            i += 1
        j -= 1
        while a[j] > pivot:                  # scan down to an element not above the pivot
            j -= 1
        if i >= j:
            return j                         # the split point
        a[i], a[j] = a[j], a[i]

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition_hoare(a, lo, hi)
        quicksort(a, lo, p)                  # Hoare returns j: recurse [lo, p] and [p+1, hi]
        quicksort(a, p + 1, hi)
    return a

Complexity (time and space)

Merge sort is $\Theta(n\log n)$ time in every case, $O(n)$ extra space, and stable. Quicksort is $\Theta(n\log n)$ average, $\Theta(n^2)$ worst (mitigated by random or median pivots), $O(\log n)$ stack space, in place, and not stable. The $\Omega(n\log n)$ lower bound means neither can be beaten asymptotically by any comparison sort.

Worked example

Both produce the same sorted output; merge sort is stable, quicksort is in place:

print(merge_sort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
print(quicksort([5, 2, 9, 1, 5, 6]))    # [1, 2, 5, 5, 6, 9]

Follow-up questions

  • Why can no comparison sort beat n log n? There are $n!$ orderings and a comparison decision tree of height $h$ has $\le 2^h$ leaves, so $2^h \ge n!$ forces $h = \Theta(n\log n)$.
  • What is quicksort's worst case and how is it avoided? $O(n^2)$ on consistently extreme pivots; randomizing the pivot or median-of-three makes it unlikely.
  • What does stable mean and when does it matter? Equal keys keep their input order, which matters when sorting by multiple keys in sequence.
  • Merge sort vs quicksort, which when? Merge sort for stability, linked lists, or external data; quicksort for in-place speed on arrays.
  • Why is quicksort often fastest in practice despite the worst case? In-place partitioning is cache-friendly and has small constants, and the worst case is avoided with randomized pivots.

References

  1. Cormen, Leiserson, Rivest & Stein, Introduction to Algorithms (CLRS, 4th ed.).
  2. Hoare, Quicksort (Computer Journal, 1962).
  3. Sedgewick & Wayne, Algorithms (4th ed.).
  4. Knuth, The Art of Computer Programming, Vol. 3.