Breadth-first search (BFS)

Breadth-first search visits nodes in order of distance from the start, one full level before the next, using a first-in-first-out queue. Because it always reaches nearer nodes before farther ones, it returns the shortest path in an unweighted graph for free. Structurally it is identical to depth-first search with a single change: a queue instead of a stack.

Why a queue gives shortest paths

The queue holds nodes in nondecreasing order of distance: at any moment it contains some nodes at distance $d$ followed by some at $d+1$, never anything farther. So when a node is first discovered, it is reached through a node already at minimal distance, which makes its recorded distance minimal too. This level-order property is exactly what a stack would destroy, since a stack would dive deep before finishing a level, so it is the queue that makes BFS a shortest-path algorithm.

Distances and parent pointers

Record a distance and a parent the first time each node is seen, and never overwrite them, since the first sighting is already optimal. To recover the actual path, follow parents back from the goal to the start and reverse. Marking a node as seen at discovery (when it is enqueued), not when it is dequeued, is what prevents the same node from being queued twice.

When BFS is the right tool

Use BFS for shortest paths and fewest-steps problems on unweighted graphs and grids, for level-order tree traversal, and to find the nearest node satisfying a condition. Seeding the queue with several starts at once gives multi-source BFS, which finds each node's distance to its closest source in one pass. For weighted graphs BFS no longer gives shortest paths and you need Dijkstra, though for weights that are only 0 or 1 a double-ended queue (0-1 BFS) recovers linear time.

Variants

Unweighted shortest path: the base case, where first discovery is the fewest-edges path.
Level-order traversal: process the queue one full level at a time, for trees or distance layers.
Multi-source BFS: seed the queue with many starts at distance 0 to get each node's distance to its nearest source in one pass.
0-1 BFS: a double-ended queue (push-front for 0-weight edges, push-back for 1-weight) gives shortest paths with 0/1 weights in O(V+E).
Bidirectional BFS: search forward from the start and backward from the goal until the frontiers meet, roughly square-rooting the nodes explored.
Kahn's topological sort: a BFS over in-degree-zero nodes (see graph theory).
Dijkstra and A*: BFS generalized to weighted graphs by swapping the queue for a priority queue, plus a heuristic for A* (see Dijkstra).
Grid / flood BFS: BFS over a grid for shortest paths or region growing.

Step by step

Put the start in a queue and record its distance as 0.
Pop the front node; if it is the goal, stop.
For each unseen neighbor, record distance and parent and enqueue it.
Follow parents back from the goal to reconstruct the path.

from collections import deque

def bfs_shortest(graph, start, goal):
    q = deque([start])
    dist = {start: 0}
    parent = {start: None}
    while q:
        u = q.popleft()
        if u == goal:
            break
        for v in graph[u]:
            if v not in dist:           # first sighting is the shortest
                dist[v] = dist[u] + 1
                parent[v] = u
                q.append(v)
    if goal not in dist:
        return None
    path, cur = [], goal
    while cur is not None:
        path.append(cur); cur = parent[cur]
    return path[::-1]

Complexity (time and space)

$O(V + E)$ time, since each vertex is enqueued once and each edge examined once, and $O(V)$ space for the queue, distance, and parent maps. It gives correct shortest paths only when every edge has equal weight; otherwise the equal-distance assumption fails and Dijkstra is required.

Worked example

On $0 \to \{1,2\}$, $1 \to 3$, $2 \to 3$, BFS reaches 3 at distance 2 along the first discovered shortest path:

G = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(bfs_shortest(G, 0, 3))   # [0, 1, 3]

Follow-up questions

Why does BFS give shortest paths only when unweighted? The queue orders nodes by hop count, so first discovery is minimal in edges; with unequal weights fewer hops need not mean lower cost.
Why mark nodes seen at enqueue, not dequeue? Marking at discovery prevents the same node from being added to the queue multiple times, keeping the traversal linear.
What is multi-source BFS? Seed the queue with several starts at distance 0; each node then gets its distance to the nearest source in one pass.
BFS vs DFS, the only real difference? The frontier data structure: a queue (BFS, level order) versus a stack (DFS, depth first).
When would you use 0-1 BFS? When edge weights are only 0 or 1: a deque with front/back inserts keeps it linear instead of needing a heap.

References

Cormen, Leiserson, Rivest & Stein, Introduction to Algorithms (CLRS, 4th ed.).
Moore, The Shortest Path Through a Maze (1959).
Sedgewick & Wayne, Algorithms (4th ed.).