Most graph work starts with two choices: how to store the graph and which traversal to run. An adjacency list stores, per node, its neighbors, using $O(V+E)$ space and making iteration over edges cheap; an adjacency matrix is a $V \times V$ table giving $O(1)$ edge lookups but $O(V^2)$ space. The list is the default because real graphs are sparse. Two tasks then recur, ordering a dependency graph and detecting cycles, and they are two sides of the same coin.
Topological sort
A topological order lists the nodes so every edge points forward, and it exists if and only if the graph is a directed acyclic graph. Kahn's algorithm computes it by repeatedly removing a node with in-degree zero (nothing depends on it yet) and decrementing its neighbors' in-degrees. The why-it-also-detects-cycles falls out for free: if at the end you have emitted fewer than $V$ nodes, the remainder form a cycle and no valid order exists. A depth-first alternative produces the order by reversing the post-order finish times.
Cycle detection
In a directed graph, color nodes white (unseen), gray (on the current recursion path), and black (finished); an edge to a gray node is a back edge, which is exactly a cycle, because gray means "an ancestor still being explored." In an undirected graph the test is simpler: any edge to an already-visited node that is not the immediate parent closes a cycle, and union-find gives an alternative that adds edges and reports the first one whose endpoints already share a set.
Where it shows up
Topological order is the backbone of build systems, package managers, course-prerequisite planning, and spreadsheet recalculation; cycle detection underlies deadlock detection and validating that a set of dependencies is even satisfiable. The same $O(V+E)$ traversal answers both.
Step by step
- Topological sort (Kahn): compute every node's in-degree.
- Queue all in-degree-zero nodes; pop one, append it, and decrement its neighbors.
- If fewer than V nodes are emitted, a cycle exists.
- Cycle detection (DFS): color nodes; a gray neighbor is a back edge, hence a cycle.
from collections import deque
def topological_sort(graph):
indeg = {u: 0 for u in graph}
for u in graph:
for v in graph[u]:
indeg[v] += 1
q = deque([u for u in graph if indeg[u] == 0])
order = []
while q:
u = q.popleft(); order.append(u)
for v in graph[u]:
indeg[v] -= 1
if indeg[v] == 0:
q.append(v)
return order if len(order) == len(graph) else None # None means a cycle
def has_cycle(graph):
WHITE, GRAY, BLACK = 0, 1, 2
color = {u: WHITE for u in graph}
def dfs(u):
color[u] = GRAY
for v in graph[u]:
if color[v] == GRAY:
return True
if color[v] == WHITE and dfs(v):
return True
color[u] = BLACK
return False
return any(dfs(u) for u in graph if color[u] == WHITE)
That topological_sort is Kahn's algorithm: the in-degree, BFS-style
method. The other standard way is a DFS that records each node after all its descendants (post-order)
and reverses the result. Both are O(V + E); pick whichever fits the code around it.
def topological_sort_dfs(graph):
visited, order = set(), []
def dfs(u):
visited.add(u)
for v in graph[u]:
if v not in visited:
dfs(v)
order.append(u) # post-order: record after all descendants
for u in graph:
if u not in visited:
dfs(u)
return order[::-1] # reverse post-order is a topological order
Complexity (time and space)
Both run in $O(V + E)$ time and $O(V)$ space on an adjacency list: Kahn touches each edge once when decrementing in-degrees, and the colored DFS visits each vertex and edge once. The space is the in-degree or color map plus the queue or recursion stack.
Worked example
The DAG orders cleanly, and the cycle test distinguishes it from a 3-node loop:
G = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(topological_sort(G)) # [0, 1, 2, 3]
print(has_cycle(G)) # False
print(has_cycle({0: [1], 1: [2], 2: [0]})) # True
Follow-up questions
- When does a topological order exist? Exactly when the directed graph is acyclic; a cycle makes a consistent forward ordering impossible.
- Why does Kahn's algorithm detect cycles? Nodes inside a cycle never reach in-degree zero, so fewer than V nodes are emitted.
- Adjacency list vs matrix? List is $O(V+E)$ space and fast to iterate (sparse graphs); matrix is $O(V^2)$ but gives $O(1)$ edge checks (dense graphs).
- Directed vs undirected cycle detection? Directed uses the gray back-edge test on the recursion path; undirected uses a visited non-parent edge or union-find.
- What does DFS post-order give? Reversing the finish order yields a topological sort.
References
- Cormen, Leiserson, Rivest & Stein, Introduction to Algorithms (CLRS, 4th ed.).
- Kahn, Topological Sorting of Large Networks (CACM, 1962).
- Tarjan, Depth-First Search and Linear Graph Algorithms (1972).