Week 38 · Phase 5 — The Builds
The Minimax algorithm — the most elegant game AI ever invented, and an opponent that cannot lose.
Photo · Michal Vrba / Unsplash
Tic-tac-toe has 255,168 possible games. That sounds like a lot, but to a CPU it's ten milliseconds of work. The computer can quite literally think through every possible game, decide which move leads to the best outcome assuming both players play perfectly, and pick that move. The result is an AI that draws or wins every game it plays. There is no luck involved. There is no machine learning involved. There is just a function calling itself.
The algorithm is Minimax, invented in 1928 by John von Neumann. It still underlies every chess engine, every Go engine, every game AI that does forward search. AlphaGo's tree search is Minimax with a neural network bolted on. We're going to write the bare version today, in about 30 lines of C++.
Two players: Maximiser wants the score as high as possible, Minimiser wants it as low as possible. Each board position has a value from the maximiser's perspective.
An X win is worth +1, an O win is worth -1, and a draw is worth 0. "What's the value of this position?" reduces to "what's the value of each child position?" — which reduces recursively until we hit a terminal state. The recursion is the algorithm.
// returns the best score this position leads to,
// assuming both players play optimally.
// player_to_move: +1 = Maximiser (X), -1 = Minimiser (O)
// needs <climits> (INT_MIN/INT_MAX) and <algorithm> (std::max/std::min)
int minimax(Board& board, int player_to_move) {
    // terminal cases
    Cell w = board.winner();
    if (w == Cell::X) return +1;
    if (w == Cell::O) return -1;
    if (board.full()) return 0;

    // recursive case: try every move, score it, pick best/worst
    int best = (player_to_move == +1) ? INT_MIN : INT_MAX;
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            if (board.at(r, c) != Cell::Empty) continue;
            // try this move
            board.place(r, c, player_to_move == +1 ? Cell::X : Cell::O);
            int score = minimax(board, -player_to_move);
            board.unplace(r, c); // undo
            if (player_to_move == +1)
                best = std::max(best, score);
            else
                best = std::min(best, score);
        }
    }
    return best;
}
That's all of it. Maximiser tries each empty square, recursively scores the resulting position with Minimiser to move, and picks the move that gives the highest score. Minimiser does the symmetric thing — picks the move that gives the lowest score. Recursion bottoms out at terminal positions.
(You'll need a small unplace method on the Board to undo a move — just set the cell back to Empty. We do this so we don't have to copy the whole board every recursive call.)
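If your Board class is missing any of these methods, here is a minimal sketch of the interface the snippets above assume. The method names (at, place, unplace, winner, full) are the ones used in this article's code; adapt them to whatever your own class looks like:

```cpp
#include <array>

enum class Cell { Empty, X, O };

// Minimal Board sketch. This is an assumed interface matching the
// snippets above, not the article's exact class -- adapt as needed.
class Board {
    std::array<std::array<Cell, 3>, 3> grid{};  // value-initialised to Empty
public:
    Cell at(int r, int c) const { return grid[r][c]; }
    void place(int r, int c, Cell p) { grid[r][c] = p; }
    void unplace(int r, int c) { grid[r][c] = Cell::Empty; }  // undo a move
    bool full() const {
        for (const auto& row : grid)
            for (Cell cell : row)
                if (cell == Cell::Empty) return false;
        return true;
    }
    Cell winner() const {
        for (int i = 0; i < 3; i++) {
            if (grid[i][0] != Cell::Empty && grid[i][0] == grid[i][1] &&
                grid[i][1] == grid[i][2]) return grid[i][0];       // row i
            if (grid[0][i] != Cell::Empty && grid[0][i] == grid[1][i] &&
                grid[1][i] == grid[2][i]) return grid[0][i];       // column i
        }
        if (grid[1][1] != Cell::Empty &&
            ((grid[0][0] == grid[1][1] && grid[1][1] == grid[2][2]) ||
             (grid[0][2] == grid[1][1] && grid[1][1] == grid[2][0])))
            return grid[1][1];                                     // diagonals
        return Cell::Empty;                                        // no winner
    }
};
```

Passing the board by reference and undoing with unplace keeps the search allocation-free; copying the nine-cell board on every call would also work and is simpler, just slightly slower.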
Now in the AI's turn, instead of asking the user, run minimax for each empty square and pick the highest-scoring move:
struct Move { int r, c; };

Move best_move(Board& board, Cell player) {
    int sign = (player == Cell::X) ? +1 : -1;
    int best_score = -sign * INT_MAX;  // worst possible score for this player
    Move best = {-1, -1};
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++) {
            if (board.at(r, c) != Cell::Empty) continue;
            board.place(r, c, player);
            int score = minimax(board, -sign);
            board.unplace(r, c);
            if (sign * score > sign * best_score) {  // "better" flips with sign
                best_score = score;
                best = {r, c};
            }
        }
    return best;
}
Plug this into your game loop where O's move was read, in place of cin >> r >> c. The AI will play perfectly; you'll never beat it. From the AI's first move the game is determined: best play leads to a draw.
Minimax exhaustively searches the whole game tree. For tic-tac-toe, the tree has about 255,000 leaves; modern computers explore that in milliseconds. The recursion bottoms out at terminal positions (clear answer); intermediate nodes propagate the right answer up.
Why assume the opponent plays optimally? Because the assumption is safe: if the opponent actually plays worse than optimal, the AI's outcome can only improve. Minimax's score is a guaranteed floor, not a prediction. Pessimism is conservative.
Minimax explores every game from this position onwards. For tic-tac-toe (~255K games) it's fine. For checkers (~10²⁰ positions) it's hopeless. For chess (~10⁴³ positions, and on the order of 10¹²⁰ possible games) it's beyond hopeless. The pure algorithm doesn't scale beyond simple games.
The fixes are layered: alpha-beta pruning skips branches that provably cannot change the result; a depth limit plus a heuristic evaluation function stands in for exhaustive search below the horizon; move ordering and transposition tables make the pruning bite harder; Monte Carlo tree search samples promising lines instead of enumerating everything.
For tic-tac-toe none of this matters. For chess all of it matters.
Minimax is the cleanest example of search over a tree of possibilities — a paradigm that runs through chess, Go, robot path-planning, theorem proving, and modern reinforcement learning. AlphaGo's MCTS is a sophisticated relative; AlphaZero's RL training is built on it; today's "agentic" LLMs that plan multi-step actions are conceptually doing tree search over possible action sequences.
You just wrote the great-grandfather of AlphaGo. In 30 lines of C++.
A perfect tic-tac-toe player and a world-class Go engine differ in degree, not in kind.
Challenge: add a counter to minimax and print how many calls happen per move. Notice how it shrinks as the board fills.

Tic-tac-toe is small. Time to go bigger. Next week we start project 2: a Scrabble clone. The board is bigger, the rules are more complex, and the dictionary problem alone is a serious data-structures challenge.
Week 39 is Project 2: Scrabble Clone.
Photo "Thinking" by Michal Vrba, free under the Unsplash license.