Week 38 · Phase 5 — The Builds
The Minimax algorithm — the most elegant game AI ever invented, and an opponent that cannot lose.
Photo · Michal Vrba / Unsplash
Tic-tac-toe has 255,168 possible games. That sounds like a lot, but to a CPU it's ten milliseconds of work. The computer can quite literally think through every possible game, decide which move leads to the best outcome assuming both players play perfectly, and pick that move. The result is an AI that draws or wins every game it plays. There is no luck involved. There is no machine learning involved. There is just a function calling itself.
The algorithm is Minimax, invented in 1928 by John von Neumann. It still underlies every chess engine, every Go engine, every game AI that does forward search. AlphaGo's tree search is Minimax with a neural network bolted on. We're going to write the bare version today, in about 30 lines of C++.
Two players: Maximiser wants the score as high as possible, Minimiser wants it as low as possible. Each board position has a value from the maximiser's perspective.
An X win is worth +1, an O win is worth -1, and a draw is worth 0. "What's the value of this position?" reduces to "what's the value of each child position?" — which reduces recursively until we hit a terminal state. The recursion is the algorithm.
// returns the best score this position leads to,
// assuming both players play optimally.
// player_to_move: +1 = Maximiser (X), -1 = Minimiser (O)
// needs <climits> (INT_MIN/INT_MAX) and <algorithm> (std::max/std::min)
int minimax(Board& board, int player_to_move) {
    // terminal cases
    Cell w = board.winner();
    if (w == Cell::X) return +1;
    if (w == Cell::O) return -1;
    if (board.full()) return 0;

    // recursive case: try every move, score it, pick best/worst
    int best = (player_to_move == +1) ? INT_MIN : INT_MAX;
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            if (board.at(r, c) != Cell::Empty) continue;
            // try this move
            board.place(r, c, player_to_move == +1 ? Cell::X : Cell::O);
            int score = minimax(board, -player_to_move);
            board.unplace(r, c); // undo
            if (player_to_move == +1)
                best = std::max(best, score);
            else
                best = std::min(best, score);
        }
    }
    return best;
}
That's all of it. Maximiser tries each empty square, recursively scores the resulting position with Minimiser to move, and picks the move that gives the highest score. Minimiser does the symmetric thing — picks the move that gives the lowest score. Recursion bottoms out at terminal positions.
(You'll need a small unplace method on the Board to undo a move — just set the cell back to Empty. We do this so we don't have to copy the whole board every recursive call.)
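If your Board class is missing any of these methods, here is a minimal sketch of the interface the snippets above assume. The method names (at, place, unplace, winner, full) are the ones used in this article's code; adapt them to whatever your own class looks like:

```cpp
#include <array>

enum class Cell { Empty, X, O };

// Minimal Board sketch. This is an assumed interface matching the
// snippets above, not the article's exact class -- adapt as needed.
class Board {
    std::array<std::array<Cell, 3>, 3> grid{};  // value-initialised to Empty
public:
    Cell at(int r, int c) const { return grid[r][c]; }
    void place(int r, int c, Cell p) { grid[r][c] = p; }
    void unplace(int r, int c) { grid[r][c] = Cell::Empty; }  // undo a move
    bool full() const {
        for (const auto& row : grid)
            for (Cell cell : row)
                if (cell == Cell::Empty) return false;
        return true;
    }
    Cell winner() const {
        for (int i = 0; i < 3; i++) {
            if (grid[i][0] != Cell::Empty && grid[i][0] == grid[i][1] &&
                grid[i][1] == grid[i][2]) return grid[i][0];       // row i
            if (grid[0][i] != Cell::Empty && grid[0][i] == grid[1][i] &&
                grid[1][i] == grid[2][i]) return grid[0][i];       // column i
        }
        if (grid[1][1] != Cell::Empty &&
            ((grid[0][0] == grid[1][1] && grid[1][1] == grid[2][2]) ||
             (grid[0][2] == grid[1][1] && grid[1][1] == grid[2][0])))
            return grid[1][1];                                     // diagonals
        return Cell::Empty;                                        // no winner
    }
};
```

Passing the board by reference and undoing with unplace keeps the search allocation-free; copying the nine-cell board on every call would also work and is simpler, just slightly slower.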
Now in the AI's turn, instead of asking the user, run minimax for each empty square and pick the highest-scoring move:
struct Move { int r, c; };

Move best_move(Board& board, Cell player) {
    int sign = (player == Cell::X) ? +1 : -1;
    int best_score = -sign * INT_MAX;  // worst possible score for this player
    Move best = {-1, -1};
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++) {
            if (board.at(r, c) != Cell::Empty) continue;
            board.place(r, c, player);
            int score = minimax(board, -sign);
            board.unplace(r, c);
            if (sign * score > sign * best_score) {  // "better" flips with sign
                best_score = score;
                best = {r, c};
            }
        }
    return best;
}
Plug this into your game loop where O's move was read, in place of cin >> r >> c. The AI will play perfectly; you'll never beat it. From the AI's first move the game is determined: best play leads to a draw.
Minimax exhaustively searches the whole game tree. For tic-tac-toe, the tree has about 255,000 leaves; modern computers explore that in milliseconds. The recursion bottoms out at terminal positions (clear answer); intermediate nodes propagate the right answer up.
Why assume the opponent plays optimally? Because the assumption is safe: if the opponent actually plays worse than optimal, the AI's outcome can only improve. Minimax's score is a guaranteed floor, not a prediction. Pessimism is conservative.
Minimax explores every game from this position onwards. For tic-tac-toe (~255K games) it's fine. For checkers (~10²⁰ positions) it's hopeless. For chess (~10⁴³ positions, and on the order of 10¹²⁰ possible games) it's beyond hopeless. The pure algorithm doesn't scale beyond simple games.
The fixes are layered: alpha-beta pruning skips branches that provably cannot change the result; a depth limit plus a heuristic evaluation function stands in for exhaustive search below the horizon; move ordering and transposition tables make the pruning bite harder; Monte Carlo tree search samples promising lines instead of enumerating everything.
For tic-tac-toe none of this matters. For chess all of it matters.
Minimax is the cleanest example of search over a tree of possibilities — a paradigm that runs through chess, Go, robot path-planning, theorem proving, and modern reinforcement learning. AlphaGo's MCTS is a sophisticated relative; AlphaZero's RL training is built on it; today's "agentic" LLMs that plan multi-step actions are conceptually doing tree search over possible action sequences.
You just wrote the great-grandfather of AlphaGo. In 30 lines of C++.
A perfect tic-tac-toe player and a world-class Go engine differ in degree, not in kind.
Challenge: add a counter to minimax and print how many calls happen per move. Notice how it shrinks as the board fills.

Tic-tac-toe is small. Time to go bigger. Next week we start project 2: a Scrabble clone. The board is bigger, the rules are more complex, and the dictionary problem alone is a serious data-structures challenge.
Week 39 is Project 2: Scrabble Clone.
Photo "Thinking" by Michal Vrba, free under the Unsplash license.