Week 10 · Phase 2 — The Ancestry
Compilers vs. interpreters — why C is fast, why Python is friendly, and why both matter.
Photo · Mick Haupt / Unsplash
The chef from Phase 1 only understands one language: machine code. Strings of bytes that decode into ADD, MOV, JMP, and friends. Nothing else. Not Python, not JavaScript, not C. The chef will never read your source code. Ever.
So between you and the chef, there is always a translator. The choice of which translator — and when the translation happens — is one of the deepest design decisions in any programming language. It's the reason Python is forgiving and slow, the reason C is unforgiving and fast, the reason JavaScript got a hundred times faster between 2008 and 2015 without changing its syntax, and the reason every single AI inference engine ends up spitting out machine code at the bottom.
There are two ways to bridge the gap between human source code and the chef: translate the whole program ahead of time (a compiler), or translate it on the fly, every time it runs (an interpreter).
The first is like producing the full English translation of a French novel and printing it as a book — slow up front, but every reader thereafter just reads English. The second is like having a translator stand next to you reading aloud, line by line, every time you want to read the book. Convenient if you only read it once. Murderous if you read it a million times.
Photo · Bank Phrom / Unsplash
A compiler is a printing press. The pages take time to set, but every copy after the first comes out fast.
"Compile" sounds like one verb. It is, technically, four:
The compiler:
This whole pipeline takes seconds for a small program, minutes for Linux, an hour for a modern web browser. But once it's done, the executable can run as many times as you like, at full chef-speed. Every C program you ever ship runs through these four steps before it meets the chef.
An interpreter skips most of that. It reads your Python file, builds a quick representation, and starts executing it line by line — looking up types at runtime, dispatching operators at runtime, allocating memory at runtime. It is friendly and forgiving (you don't have to wait for "the build" — you just run), but it's also paying a steep tax on every single line.
The price: when you write x + y in Python, the interpreter has to look up what x is, look up what y is, find the right "plus" function for those types, call it, store the result, and continue. In compiled C, the same line is one machine instruction. Tens of clock cycles in Python become one in C, and that's the optimistic case.
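You can watch that dispatch machinery directly with Python's built-in dis module, which lists the bytecode the interpreter walks through (a sketch; exact opcode names vary between Python versions):

```python
import dis

# Compile the expression "x + y" and list the bytecode
# instructions the interpreter will dispatch at runtime.
code = compile("x + y", "<example>", "eval")
for instr in dis.get_instructions(code):
    print(instr.opname)
```

On recent CPython versions this prints a LOAD instruction for each name, a BINARY_OP (BINARY_ADD on older versions), and a return, and each of those is a full trip through the interpreter's dispatch loop, where compiled C would emit a single add instruction.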
For interactive work — a notebook, a quick script, a teaching environment — interpretation is the better trade. For tight inner loops, it's miserable.
It's hard to give universally fair numbers — workloads vary, optimisations matter, modern interpreters are clever. But for arithmetic-heavy code, the order of magnitude is roughly this, with C as the baseline:
- C (compiled ahead of time): 1×, the baseline
- JIT-compiled runtimes (Java, JavaScript on V8, PyPy): within a small factor of C on hot loops
- CPython (pure interpretation): 50–100× slower
Read the bottom row again. The same loop in vanilla Python is 50 to 100 times slower than the same loop in C. This is real and observable. It's why every Python data-science library is, secretly, a thin Python wrapper around a fast C or C++ engine.
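You can feel this gap with nothing but the standard library (a sketch, not a benchmark; absolute numbers depend on your machine): time a pure-Python loop against the C-implemented sum builtin doing the same work.

```python
import timeit

data = list(range(1_000_000))

def python_loop(xs):
    # Every iteration pays the full interpreter tax:
    # fetch, dispatch, type-check, add, store.
    total = 0
    for x in xs:
        total += x
    return total

def c_loop(xs):
    # sum() runs the same loop inside CPython's C code.
    return sum(xs)

t_py = timeit.timeit(lambda: python_loop(data), number=5)
t_c = timeit.timeit(lambda: c_loop(data), number=5)
print(f"python loop: {t_py:.3f}s, builtin sum: {t_c:.3f}s, ratio ~{t_py / t_c:.0f}x")
```

The builtin wins by a wide margin even though both run "in Python", because only one of them steps through the interpreter on every element.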
For a long time the choice was binary: compile, ship, accept the workflow pain — or interpret, accept the speed hit. Then somebody had a clever idea.
Run the program through an interpreter. Watch which functions get called a lot. After they've been hit a few hundred times, take that hot function aside, compile it to machine code in the background, and from then on use the compiled version. Cold code stays interpreted (fast to start). Hot code becomes compiled (fast to run). You get most of compilation's speed and most of interpretation's flexibility.
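That mechanism can be sketched in a few dozen lines of Python. Everything here (HOT_THRESHOLD, JittedExpr, the tuple tree format) is an illustrative toy, not any real runtime's design: interpret an arithmetic expression tree by walking it, count the calls, and once it turns hot, compile the tree to real Python code and swap that in.

```python
# A toy JIT: tree-walking interpreter + compile-when-hot.
HOT_THRESHOLD = 100  # interpreted calls before we bother compiling

class JittedExpr:
    """Arithmetic expressions as nested tuples, e.g. ("+", "x", ("*", "x", "x"))."""

    def __init__(self, tree):
        self.tree = tree
        self.calls = 0
        self.compiled = None  # filled in once the function is hot

    def _interpret(self, node, env):
        # Cold path: dispatch on the node type at every single step.
        if isinstance(node, (int, float)):
            return node
        if isinstance(node, str):
            return env[node]
        op, left, right = node
        a = self._interpret(left, env)
        b = self._interpret(right, env)
        return a + b if op == "+" else a * b

    def _to_source(self, node):
        # One-time translation of the tree into Python source text.
        if isinstance(node, (int, float)):
            return repr(node)
        if isinstance(node, str):
            return node
        op, left, right = node
        return f"({self._to_source(left)} {op} {self._to_source(right)})"

    def __call__(self, **env):
        self.calls += 1
        if self.compiled is None and self.calls >= HOT_THRESHOLD:
            # Hot: compile once in the "background", reuse forever.
            code = compile(self._to_source(self.tree), "<jit>", "eval")
            self.compiled = lambda e: eval(code, {}, e)
        if self.compiled is not None:
            return self.compiled(env)           # fast path
        return self._interpret(self.tree, env)  # cold path

f = JittedExpr(("+", "x", ("*", "x", "x")))  # x + x*x
print(f(x=3))                   # 12, via the tree-walking interpreter
for i in range(200):
    f(x=i)
print(f.compiled is not None)   # True: hot calls now run compiled code
```

A real JIT compiles to machine code rather than Python bytecode and specialises on observed types, but the shape is the same: interpret while cold, count, compile when hot, swap.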
This is Just-In-Time compilation, or JIT. JavaScript got 100× faster between 2008 and 2015 mostly because Google built V8, an aggressive JIT for JavaScript. Java has had a JIT since the late '90s. .NET has one. Python's "PyPy" is a JIT for Python (and is several times faster than CPython on heavy loops). Modern AI runtimes — including the inner loops of PyTorch's compiled mode — use JIT-style techniques to specialise for the actual shapes of tensors at runtime.
The translator stops being one strategy and becomes a continuum: pure interpretation on one end, pure compilation on the other, and everything in between is a JIT trying to figure out which lines are worth compiling early.
Photo · Possessed Photography / Unsplash
A modern compiler pipeline really is a factory: source goes in, optimisation passes happen, and machine-code rolls off the line.
Because the language was deliberately designed to not get in the way of the compiler. Every C type has a known size at compile time. Every C function has a known signature. Pointer arithmetic is allowed. There's no garbage collector watching over you. There's no runtime type checking. The compiler can reason about your program with extreme precision, generate excellent assembly, and the resulting executable speaks the chef's language directly with no translation layer at runtime.
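One concrete piece of that: C types have sizes fixed by the platform ABI before the program ever runs, which you can poke at from Python's own ctypes module. A Python int, by contrast, is a heap object whose size depends on the value it holds (a sketch; plain c_int's width is platform-defined, the fixed-width types are not):

```python
import ctypes
import sys

# Sizes the C compiler knows at compile time.
for name, ctype in [("int32_t", ctypes.c_int32),
                    ("int64_t", ctypes.c_int64),
                    ("double", ctypes.c_double)]:
    print(f"{name}: {ctypes.sizeof(ctype)} bytes")

# A Python int is a full heap object; bigger values need more bytes.
print("Python int 1:", sys.getsizeof(1), "bytes")
print("Python int 10**100:", sys.getsizeof(10**100), "bytes")
```

That fixed layout is exactly what lets the C compiler plan registers and memory precisely, and exactly what CPython has to discover at runtime, over and over.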
This is also why C is dangerous. The same lack of guard rails that lets the compiler optimise your code aggressively also lets you walk straight off a cliff. The overwhelming majority of serious memory-safety bugs filed against OS kernels — buffer overflows, dangling pointers, use-after-free — trace back to C and its derivatives. We pay for the speed.
Modern alternatives like Rust try to keep C-level speed while making the worst categories of bugs literally impossible to express. We'll come back to that argument in Phase 7.
AI training and inference live or die on the inner-loop speed of a few specific operations: large matrix multiplications, convolutions, attention. Those operations are written in C++/CUDA, compiled aggressively, and called from Python only once per "step". The Python for loop never sees a single floating-point number.
This is the trick of modern numerical Python. Your code looks like x = matrix @ vector. That single line dispatches into a precompiled C++ kernel that runs at full chef-speed for milliseconds. Then you get one slow Python step, then another precompiled kernel, and so on. Most of your running time is in the kernels — and the kernels are, by careful design, compiled ahead of time.
If you ever look at why a numerical Python program is slow, the answer is almost always: too much Python between the kernels. The kernels themselves are already as fast as they're going to get.
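The same principle can be shown without NumPy at all (a stdlib sketch; in real numerical code the fix is vectorising with NumPy or PyTorch): push the loop itself into C-implemented code instead of stepping through it in Python.

```python
import operator
import timeit

a = [float(i) for i in range(200_000)]
b = [float(i) % 7 for i in range(200_000)]

def dot_python(a, b):
    # The loop body is tiny, but the interpreter runs between
    # every pair of elements: "too much Python between the kernels".
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def dot_pushed_down(a, b):
    # map() and sum() run their loops inside CPython's C code;
    # Python only sees the final result.
    return sum(map(operator.mul, a, b))

t_loop = timeit.timeit(lambda: dot_python(a, b), number=10)
t_push = timeit.timeit(lambda: dot_pushed_down(a, b), number=10)
print(f"python loop: {t_loop:.3f}s, pushed down: {t_push:.3f}s")
```

Both compute the same dot product; the second just keeps the per-element work out of the interpreter, which is the whole design philosophy of the kernel-calling style.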
"Compiled" and "interpreted" are not opposites. They are the two ends of one dial — and modern systems are constantly turning it.
Watch a compiler at work, end-to-end:
hello.c:
#include <stdio.h>

int main(void) {
    printf("hello, world\n");
    return 0;
}
1. cc hello.c -o hello && ./hello. Two seconds, then a binary on disk.
2. cc -S -O2 hello.c. Open hello.s. That is what the chef will actually see — a few dozen lines of ARM or x86.
3. Create hello.py with one print("hello, world"). Run python3 hello.py. No build step, no binary — but every time you run it, Python re-parses, re-builds, re-interprets.
4. time ./hello vs time python3 hello.py. Even on this trivial program, Python's startup takes longer than C's entire execution.

You now know there's a translator between you and the chef, and roughly how it works. Time to actually write something for the translator to translate.
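That last comparison, interpreter startup dominating a trivial program, can also be measured from inside Python itself (a sketch; absolute numbers depend on your machine):

```python
import subprocess
import sys
import time

# Time a full "start Python, print, exit" cycle, several times.
runs = 5
start = time.perf_counter()
for _ in range(runs):
    subprocess.run([sys.executable, "-c", "print('hello, world')"],
                   check=True, capture_output=True)
elapsed = time.perf_counter() - start
print(f"average startup+run: {elapsed / runs * 1000:.1f} ms")
```

Most of that time is the interpreter booting, importing, and tearing down, work the compiled hello binary simply never does.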
Week 11 is Hello World — the most-written program in history, dissected line by line. By the end of next week you'll know exactly what every word in #include <stdio.h> means, and why main is special.
All photos are free under the Unsplash license. Dictionary · Mick Haupt · Press · Bank Phrom · Factory · Possessed Photography. Pipeline and speed-bars are inline SVG / CSS.