Week 36 — Trees & Graphs · From Zero to AI Hero

An art installation showing many spheres connected by lines, suggesting a network.

Once you can see graphs, you can see them everywhere. File systems, the web, the brain, neural networks — same shape underneath.

Up to now: rows, chains, stacks, hash tables. All of them have a fixed connection pattern. The most general data structure has no fixed pattern at all: just nodes that point to other nodes, in any topology you like. We call it a graph. When the graph has no cycles and a single root, we call it a tree. When the graph is undirected, finite, and connected, we still call it a graph. The two names refer to the same idea at different shapes.

Once you can see graphs, you see them everywhere. The web is a graph (pages linking to pages). Your filesystem is a tree. Your social network is a graph. The dependency tree of a build system is a tree. The transformer's computation graph is a graph. Phase 7's whole world is, structurally, graphs.

The vocabulary

Node (also vertex): one of the things being connected.
Edge: a connection between two nodes.
Directed: edges have a direction (A → B is not the same as B → A). The web is directed.
Undirected: edges go both ways. Friendship in a social network (mostly).
Weighted: each edge has a number — distance, cost, probability.
Tree: a connected graph with no cycles. Each child has exactly one parent.
Binary tree: each node has at most two children.
DAG (directed acyclic graph): directed, no cycles. Build dependency graphs, autograd graphs.

Two ways to represent

Adjacency list

Each node has a list of its neighbours. The standard.

std::vector<std::vector<int>> graph(N);
graph[0].push_back(1);      // edge 0 -> 1
graph[0].push_back(2);      // edge 0 -> 2
graph[1].push_back(3);      // edge 1 -> 3

// to visit all neighbours of node n:
for (int m : graph[n]) { visit(m); }

Memory: O(V + E), where V is the number of vertices and E the edges. Best for sparse graphs (most graphs are sparse).

Adjacency matrix

An N×N grid; matrix[i][j] = 1 if there's an edge from i to j.

Memory: O(V²). Faster lookup ("is there an edge?") in O(1). Used for very dense graphs and for some algorithms (Floyd-Warshall) that need quick edge queries.

The two algorithms you must know

Depth-First Search (DFS)

Walk as deep as you can, backtrack when stuck. Implemented with recursion (or an explicit stack):

void dfs(int n, std::vector<bool>& visited) {
    if (visited[n]) return;
    visited[n] = true;
    visit(n);
    for (int m : graph[n]) dfs(m, visited);
}

Used for: cycle detection, topological sort, finding connected components, solving mazes.

Breadth-First Search (BFS)

Explore by distance from the source. Uses a queue:

void bfs(int start) {
    std::queue<int> q;
    std::vector<bool> visited(graph.size(), false);
    q.push(start); visited[start] = true;
    while (!q.empty()) {
        int n = q.front(); q.pop();
        visit(n);
        for (int m : graph[n]) {
            if (!visited[m]) {
                visited[m] = true;
                q.push(m);
            }
        }
    }
}

Used for: shortest path on unweighted graphs, level-order tree traversal, web crawling. Dijkstra's algorithm is BFS with a priority queue instead of a regular queue, generalising to weighted graphs.

Trees you'll meet

Binary search tree — sorted; lookup/insert in O(log n) when balanced. std::map is one (red-black variant).
Heap — a tree where parent > children. Used by priority queues (Week 35).
Trie — a tree of characters; great for autocomplete and dictionary-based games (Scrabble — Week 39!).
Quad-tree / oct-tree — spatial subdivision for 2D/3D collision detection and rendering.
B-tree — wide branching, designed for disk pages. Every database index is one of these.

Why this matters for AI

The autograd "computation graph" inside PyTorch is a directed acyclic graph: each tensor operation is a node, the inputs are edges. To compute gradients, you walk the graph backwards (a topological sort + DFS). Without graphs as a concept, modern deep learning is unimaginable.

Game-tree search — Minimax, Alpha-Beta — is a tree traversal. Phase 5's chess AI lives entirely in that algorithmic family.

Knowledge graphs, retrieval-augmented generation, the relationship between attention heads — all graphs. After this week, every "X is connected to Y" relationship in the AI literature should look like a familiar shape.

Phase 4 is done. You now have the data structures and the analysis tools. The rest of the course is about putting them to work.

Try it yourself

Build a small adjacency-list graph of cities and roads. Implement BFS to find the shortest path (in number of hops) between two cities.
Implement a binary search tree. Add a few values. Walk it in-order; confirm the values come out sorted.
Skim the Wikipedia page for "graph data structure" and notice how many specific algorithms exist for graph problems. There's a whole field.

What's next — and welcome to Phase 5

You now have the language (C/C++), the abstractions (OO), and the toolbox (data structures and algorithms). Time to actually build something.

Phase 5 — The Builds — is three games, each ending with an AI opponent. Tic-Tac-Toe → Scrabble → Chess. By the end of Phase 5 you'll have written 5,000 lines of working C++ that plays games — and the AI techniques you'll learn there are more or less the same ones that won AlphaGo.

Week 37 is Project 1 — Tic-Tac-Toe.

Photo credit

Photo free under the Unsplash license. Network · Alina Grubnyak.