Your program has been running for 20 minutes. Is it stuck in an infinite loop, or is it just thinking really hard? Turns out, no algorithm in the universe can answer that for you.
In Part 1, we matched patterns with no memory. In Part 2, we added a stack and parsed nested structures. In Part 3, we got a bounded tape and checked cross-references. Now we remove all restrictions. Infinite tape. Unlimited time. Full power. And we discover that even with all of that, some problems are impossible.
The Turing Machine
In 1936, Alan Turing imagined the simplest possible general-purpose computer. It has:
- An infinite tape divided into cells, each holding a symbol
- A head that reads and writes one cell at a time
- A finite set of states with transition rules
- The ability to move the head left or right
That’s it. No RAM, no GPU, no cloud. And yet this simple machine can compute anything that any computer can compute. Your laptop, your Kubernetes cluster, your entire AWS region, they are all just very fancy Turing machines.
A transition rule looks like this:
\[\delta(q, a) = (q', b, D)\]And it states simply: “If you’re in state $q$ reading symbol $a$, write $b$, move direction $D$ (left or right), and go to state $q’$.”
Here is the key difference from everything we have seen before:
| Machine | Tape | Direction | Halts? |
|---|---|---|---|
| Finite Automaton | Read-only | Right only | Always |
| Pushdown Automaton | Input + stack | Right only | Always |
| Linear Bounded Automaton | Read/write | Both | Always (bounded) |
| Turing Machine | Read/write | Both | Maybe never :D |
That “maybe never” is not a bug. It is the defining feature. A Turing machine might run forever, and there is no general way to predict whether it will.
Unrestricted Grammars (Type 0)
At the top of the Chomsky hierarchy sit unrestricted grammars. Their production rules have no restrictions at all:
\[\alpha \to \beta\]Where $\alpha$ is any non-empty string containing at least one non-terminal, and $\beta$ is any string (including the empty string $\varepsilon$).
Compare this to the lower levels:
| Type | Rule Restriction | Can shrink strings? |
|---|---|---|
| Type 3 | $A \to aB$ or $A \to a$ | No |
| Type 2 | $A \to \gamma$ (single non-terminal left) | No (except $S \to \varepsilon$) |
| Type 1 | $|\alpha| \leq |\beta|$ | No |
| Type 0 | None | Yes |
The ability to shrink strings is what makes Type 0 grammars fundamentally different. A derivation can grow, shrink, grow again, shrink again… There is no way to bound how long a derivation might take, or whether it will ever finish.
This is exactly why Turing machines can loop forever: they are the machine equivalent of unrestricted grammars.
Let’s Build a Turing Machine
Here is a Java Turing machine simulator. We will use it to recognize the language ${a^n b^n c^n : n \geq 1}$, the same language from Part 3, but now with an unbounded tape.
import java.util.*;
public class TuringMachine {
enum Direction { LEFT, RIGHT }
record Transition(String nextState, char write, Direction move) {}
private final Map<String, Map<Character, Transition>> rules = new HashMap<>();
private final Set<String> acceptStates;
private final String startState;
public TuringMachine(String startState, Set<String> acceptStates) {
this.startState = startState;
this.acceptStates = acceptStates;
}
public void addRule(String state, char read,
String nextState, char write, Direction dir) {
rules.computeIfAbsent(state, k -> new HashMap<>())
.put(read, new Transition(nextState, write, dir));
}
public boolean run(String input) {
// Tape: use a list so it can grow in both directions
List<Character> tape = new ArrayList<>();
for (char c : input.toCharArray()) tape.add(c);
tape.add('_'); // blank symbol at the end
int head = 0;
String state = startState;
int steps = 0;
int maxSteps = 10_000; // safety net (we are not solving the halting problem today)
while (steps++ < maxSteps) {
if (acceptStates.contains(state)) return true;
char current = (head >= 0 && head < tape.size()) ? tape.get(head) : '_';
var stateRules = rules.get(state);
if (stateRules == null || !stateRules.containsKey(current)) {
return false; // no transition i.e., we reject
}
Transition t = stateRules.get(current);
// Write
while (head >= tape.size()) tape.add('_');
if (head < 0) { tape.add(0, '_'); head = 0; }
tape.set(head, t.write);
// Move
head += (t.move == Direction.RIGHT) ? 1 : -1;
// Transition
state = t.nextState;
}
return false; // timed out — maybe it loops forever, who knows?
}
/**
* Builds a TM that accepts a^n b^n c^n (n >= 1).
*
* Strategy:
* 1. Replace one 'a' with 'X', scan right to find a 'b', replace with 'Y'
* 2. Continue right to find a 'c', replace with 'Z'
* 3. Rewind to the leftmost unmarked 'a', repeat
* 4. When no 'a' remains, verify no 'b' or 'c' remains either
*/
public static TuringMachine buildAnBnCn() {
TuringMachine tm = new TuringMachine("q0", Set.of("accept"));
// q0: find an 'a' and mark it
tm.addRule("q0", 'a', "q1", 'X', Direction.RIGHT);
tm.addRule("q0", 'Y', "q4", 'Y', Direction.RIGHT); // no more a's
// q1: skip a's and Y's, find a 'b'
tm.addRule("q1", 'a', "q1", 'a', Direction.RIGHT);
tm.addRule("q1", 'Y', "q1", 'Y', Direction.RIGHT);
tm.addRule("q1", 'b', "q2", 'Y', Direction.RIGHT);
// q2: skip b's and Z's, find a 'c'
tm.addRule("q2", 'b', "q2", 'b', Direction.RIGHT);
tm.addRule("q2", 'Z', "q2", 'Z', Direction.RIGHT);
tm.addRule("q2", 'c', "q3", 'Z', Direction.LEFT);
// q3: rewind to the beginning
tm.addRule("q3", 'a', "q3", 'a', Direction.LEFT);
tm.addRule("q3", 'b', "q3", 'b', Direction.LEFT);
tm.addRule("q3", 'Y', "q3", 'Y', Direction.LEFT);
tm.addRule("q3", 'Z', "q3", 'Z', Direction.LEFT);
tm.addRule("q3", 'X', "q0", 'X', Direction.RIGHT);
// q4: verify no unmarked b's or c's remain
tm.addRule("q4", 'Y', "q4", 'Y', Direction.RIGHT);
tm.addRule("q4", 'Z', "q4", 'Z', Direction.RIGHT);
tm.addRule("q4", '_', "accept", '_', Direction.RIGHT);
return tm;
}
public static void main(String[] args) {
TuringMachine tm = buildAnBnCn();
System.out.println(tm.run("abc")); // true
System.out.println(tm.run("aabbcc")); // true
System.out.println(tm.run("aaabbbccc")); // true
System.out.println(tm.run("aabbc")); // false
System.out.println(tm.run("abcabc")); // false
System.out.println(tm.run("ab")); // false
}
}
Compare this with the LBA version from Part 3. The algorithm is essentially the same (mark and scan), but now we are not restricted to the input length. The tape can grow. That extra freedom is what separates decidable from undecidable.
The Church-Turing Thesis
Before we break everything, let’s appreciate how powerful Turing machines are.
The Church-Turing Thesis (1936) states:
Any function that can be computed by an “effective procedure” can be computed by a Turing machine.
This is not a theorem (i.e., you can not prove it) but a definition of what “computable” means. Every programming language you have ever used (Java, Python, Haskell, C++, C or even Brainfuck) is Turing-complete. They can all compute exactly the same things.
So when we say “a problem is unsolvable by a Turing machine,” we mean it is unsolvable by any computer, in any language, with any amount of time and memory.
That is… a strong statement.
The Halting Problem
Here is the question that haunts every programmer:
My program has been running for 10 minutes. Will it ever finish, or should I kill it?
Let’s make this precise.
The Halting Problem: Given a program P and an input x, does P eventually stop, or does it run forever?
Some programs are obvious:
// Obviously halts
void greet() {
System.out.println("Hello!");
}
// Obviously loops forever
void spin() {
while (true) {}
}
But what about this one?
// Does this halt for every positive integer n?
void collatz(long n) {
while (n != 1) {
n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
}
}
Nobody knows. Mathematicians have checked every number up to $2^{68}$ and it always reaches 1, but nobody has proven it. If you could write a program that decides halting for all programs, you would solve this conjecture (and thousands of others) instantly.
Theorem (Turing, 1936): No such program exists. No algorithm can decide halting for all programs.
The Proof (Step by Step)
Don’t worry, this is simpler than it looks. The core idea is the liar’s paradox: “This statement is false.” If it is true, then it is false. If it is false, then it is true. Boom, contradiction. Turing applied the same trick to programs.
Step 1: Assume we have a magic oracle.
Imagine someone hands you a function halts(P, x) that always gives the correct answer:
// Fantasy function: always correctly answers "does P halt on x?"
boolean halts(Program p, Input x) {
// ... some genius algorithm ...
// returns true → P will halt on x
// returns false → P will loop forever on x
}
Step 2: Build a troublemaker.
Using this oracle, we construct a new program called Troublemaker:
void troublemaker() {
if (halts(troublemaker, noInput)) {
// Oracle says "troublemaker halts" → so we loop forever
while (true) {}
} else {
// Oracle says "troublemaker loops" → so we halt
return;
}
}
Read it carefully. Troublemaker asks the oracle about itself, and then does the opposite of whatever the oracle predicts.
Step 3: Watch it explode.
Now we ask: what happens when troublemaker() runs?
| Oracle says… | Troublemaker does… | Actual result | Oracle was… |
|---|---|---|---|
| “troublemaker will halt” | Enters while(true){} |
Loops forever | Wrong |
| “troublemaker will loop” | Calls return |
Halts | Wrong |
No matter what the oracle answers, it is wrong. The oracle contradicts itself. Therefore the oracle can not exist. $\square$
Why the Trick Works
The magic is in self-reference. Troublemaker feeds itself to the oracle and does the opposite.
It is the computational equivalent of telling a fortune-teller: “I will do the opposite of whatever you predict.”
No prediction can be correct, because you defined your behavior to contradict it.
Troublemaker program do?This is not a hack or a technicality. It is a fundamental limit. No computer, no matter how powerful, can solve the halting problem for all programs.
Why This Matters to You (Practically)
“Cool proof, but I write REST APIs, not halting oracles.”
Fair enough. But the halting problem is not just academic. It sets hard limits on what your tools can do. Every time you are frustrated that your IDE, linter, or CI pipeline is not smarter, there is a good chance the halting problem is why.
1. Dead Code Detection
Your IDE sometimes grays out code and says “unreachable statement.” But it can not catch everything:
public void process() {
if (collatzConjectureIsTrue()) {
doWork();
}
cleanup(); // Is this dead code? Nobody knows.
}
/**
* The Collatz conjecture: start with any positive integer.
* If even, divide by 2. If odd, multiply by 3 and add 1.
* Conjecture: you always reach 1.
* Unproven since 1937. Your IDE won't solve it either.
*/
boolean collatzConjectureIsTrue() {
for (long n = 1; n < Long.MAX_VALUE; n++) {
long x = n;
while (x != 1) {
x = (x % 2 == 0) ? x / 2 : 3 * x + 1;
if (x > Long.MAX_VALUE / 4) return false; // overflow guard
}
}
return true;
}
Determining whether cleanup() is reachable requires knowing whether collatzConjectureIsTrue() can return false.
That is equivalent to solving an open mathematical conjecture.
Your IDE wisely gives up.
2. Perfect Optimization Is Impossible
Could a compiler look at your program and produce the shortest equivalent program? No.
If such a “perfect optimizer” existed, you could use it to solve the halting problem:
given program P, optimize the program “run P; print 1”.
If P halts, the shortest equivalent is just “print 1”.
If P loops, the program cannot be simplified.
So a perfect optimizer would tell you whether P halts. Contradiction.
This is why compiler optimizations are heuristics.
GCC has -O0 through -O3, not -Operfect.
3. Complete Static Analysis Is Impossible
Can a tool analyze your code and find all bugs with zero false positives? By Rice’s theorem (coming up next), no.
Real tools make trade-offs:
| Tool Philosophy | False Positives | False Negatives | Example |
|---|---|---|---|
| Sound (find all) | Many | None | Formal verifier |
| Complete (no false) | None | Many | Your compiler |
| Practical | Some | Some | SpotBugs, ESLint |
Every bug-finding tool lives somewhere on this spectrum, and none can occupy the perfect corner.
4. Your CI Pipeline Has Timeouts for a Reason
Ever set a timeout on a build step?
steps:
- name: Run tests
run: mvn test
timeout-minutes: 30
This is not just about saving CI minutes. It is an engineering acknowledgment that you cannot know in advance whether a process will terminate. The timeout is a practical workaround for the halting problem.
5. Kubernetes Liveness Probes
Why does Kubernetes need liveness probes?
livenessProbe:
httpGet:
path: /health
port: 8080
periodSeconds: 10
failureThreshold: 3
Because Kubernetes cannot look at your application and determine “is this thing still doing useful work, or is it stuck?” That is the halting problem in disguise. So instead of trying to solve it analytically, Kubernetes polls (i.e., “are you still alive? ping me back”). If three pings fail, it kills the pod and restarts.
This pattern (give up on predicting, just observe and react) is the standard engineering response to undecidability.
6. Antivirus and Malware Detection
Can you build a perfect virus detector that identifies all malware and never flags clean programs? No. A virus is essentially “a program that does bad things,” and determining what a program does for all possible inputs is… the halting problem again.
This is why antivirus software uses signatures (pattern matching, Type 3!) and heuristics (behavioral analysis) rather than perfect semantic analysis. And it is why zero-day exploits exist: the scanner has never seen this pattern before, and it can not reason from first principles about what the code will do.
Rice’s Theorem: The Generalization
The halting problem says you can not decide if a program halts. Rice’s Theorem says it’s much worse than that.
Rice’s Theorem: Any non-trivial semantic property of programs is undecidable.
A property is “non-trivial” if some programs have it and some do not. Examples:
| Property | Trivial? | Decidable? |
|---|---|---|
| “P is written in Java” | — | Yes (syntax, not semantics) |
| “P has more than 100 lines” | — | Yes (syntax) |
“P outputs 42 on input 7” |
No | No |
| “P always terminates” | No | No |
| “P never throws an exception” | No | No |
| “P is equivalent to program Q” | No | No |
| “P computes a total function” | No | No |
The first two are syntactic properties (you can check them by reading the source code). Everything else is a semantic property (it depends on the program’s behavior), and Rice’s theorem says those are all undecidable.
So when someone asks “can we build a tool that checks whether this function ever returns null?” the answer is: in general, no. You can build tools that catch many cases, but you can not catch all cases without false positives.
Decidable vs. Semi-Decidable vs. Undecidable
Let’s be precise about degrees of unsolvability:
Decidable (recursive): An algorithm exists that always says YES or NO in finite time.
- “Is this string valid JSON?” — Yes. Parse it.
- “Does this regex match this string?” — Yes. Simulate the DFA.
Semi-decidable (recursively enumerable): An algorithm exists that says YES in finite time if the answer is yes, but might run forever if the answer is no.
- “Does this program halt on this input?” — Just run it. If it halts, you know. If it doesn’t… you wait. Forever.
- “Does this first-order logic formula have a proof?” — Enumerate all proofs. If one exists, you’ll find it.
Undecidable (not even semi-decidable): No algorithm can even reliably say YES.
- “Does this program loop forever on this input?” — The complement of the halting problem. You can not even confirm “yes, it loops.”
- “Are these two programs equivalent?” — Neither yes nor no is confirmable in general.
Here is a diagram that puts this in perspective:
┌─────────────────────────────────────────────────────────────┐
│ ALL PROBLEMS │
│ │
│ ┌───────────────────────────────────────────────────┐ │
│ │ SEMI-DECIDABLE (r.e.) │ │
│ │ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ DECIDABLE │ │ │
│ │ │ │ │ │
│ │ │ • "Is this valid JSON?" │ │ │
│ │ │ • "Is n prime?" │ │ │
│ │ │ • Regex matching │ │ │
│ │ │ • Context-sensitive membership │ │ │
│ │ └──────────────────────────────────┘ │ │
│ │ │ │
│ │ • "Does P halt on x?" (can confirm YES) │ │
│ │ • "Is this formula provable?" │ │
│ └───────────────────────────────────────────────────┘ │
│ │
│ • "Does P loop forever on x?" (not even semi-decidable) │
│ • "Are P and Q equivalent?" │
│ • Most problems (almost everything, really) │
└─────────────────────────────────────────────────────────────┘
The sobering truth: the set of decidable problems is a tiny island in an ocean of undecidability. Almost all problems are unsolvable. We just happen to care mostly about the solvable ones (lucky us).
Reductions: How We Prove Things Impossible
How do we actually prove that something is undecidable? We use reductions: if you could solve problem B, then you could solve problem A. And since A is already known to be undecidable, B must be undecidable too.
It’s like saying: “If you could predict the stock market, you could predict the weather.” Since predicting the weather (perfectly, long-term) is impossible, so is predicting the stock market.
Example: Program Equivalence
Claim: It is undecidable whether two programs compute the same function.
Proof by reduction from the Halting Problem:
Given a program $P$ and input $x$ (a halting problem instance), construct two programs:
// Program A: always returns 0
int programA(int input) {
return 0;
}
// Program B: runs P(x), then returns 0
int programB(int input) {
runProgram(P, x); // simulate P on x
return 0;
}
- If $P$ halts on $x$:
programBalways returns 0 → A and B are equivalent - If $P$ loops on $x$:
programBnever returns → A and B are not equivalent
So deciding “are A and B equivalent?” decides “does P halt on x?” Since the halting problem is undecidable, program equivalence must be too. $\square$
This reduction pattern shows up everywhere:
- “Does this refactored code behave the same as the original?” — Undecidable in general.
- “Is this migration backward-compatible?” — Undecidable in general.
- “Does this patch fix the bug without breaking anything else?” — Undecidable in general.
This is why we write tests. We can not prove equivalence, so we sample it.
The Practical Engineer’s Response to Undecidability
Undecidability sounds grim, but engineers have developed practical responses:
1. Approximate (Sound or Complete, Pick One)
Since perfect analysis is impossible, tools choose what to sacrifice:
┌──────────────────────────────┐
│ Perfect (impossible) │
│ all bugs, zero false alarms│
└──────────────┬───────────────┘
│
┌──────────────┴──────────────┐
│ │
┌─────────┴──────────┐ ┌────────────┴───────────┐
│ Sound (find all) │ │ Complete (no false +) │
│ trade-off: false + │ │ trade-off: misses bugs │
│ e.g., type checker │ │ e.g., compiler errors │
└────────────────────┘ └────────────────────────┘
Java’s type system is sound: if it compiles, certain errors cannot happen at runtime (no ClassCastException from generics, modulo raw types).
The cost? Occasionally fighting the type checker when you “know” it is safe.
2. Restrict the Problem Domain
Instead of solving the general problem, restrict what programs can do:
- Total functional languages (Agda, Idris): all programs must provably terminate. Halting problem solved! But you lose Turing-completeness.
- SQL (the standard): deliberately not Turing-complete. Every query terminates. (Until someone adds recursive CTEs :D)
- Regular expressions (without backreferences): guaranteed $O(n)$ matching. No halting problem.
This is the most underappreciated trick in software engineering. You do not need Turing-completeness for most tasks. Config files, query languages, template engines (i.e., tools that should always terminate) benefit from being deliberately less powerful.
3. Observe Instead of Predict
When you can not decide whether a program will fail, just watch it and react:
| Pattern | What it solves |
|---|---|
| Timeouts | “Will this request complete?” |
| Circuit breakers | “Is this service healthy?” |
| Liveness probes | “Is this process stuck?” |
| Watchdog timers | “Is the system responsive?” |
| Chaos engineering | “Will this survive failure?” |
All of these are empirical responses to the theoretical impossibility of prediction.
4. Test Instead of Prove
Since you can not prove programs correct in general, you sample their behavior:
// We can't prove this is correct for ALL inputs.
// But we can check a LOT of them.
@Property
void sortingIsIdempotent(@ForAll List<Integer> list) {
List<Integer> sorted = sort(list);
assertEquals(sorted, sort(sorted));
}
Property-based testing (QuickCheck, jqwik, Hypothesis) generates random inputs and checks invariants. It is not a proof, but it is the best practical approximation we have.
The Chomsky Hierarchy: Complete
We have now climbed the entire hierarchy:
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
░ Type 0: Recursively Enumerable ░
░ (Turing Machine) ░ <-- YOU ARE HERE
░ ░
░ "Some of us may not halt, and that's ░
░ a feature, not a bug" ░
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
│
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
░ Type 1: Context-Sensitive ░ DONE
░ (Linear Bounded Automaton) ░
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
│
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
░ Type 2: Context-Free ░ DONE
░ (Pushdown Automaton) ░
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
│
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
░ Type 3: Regular ░ DONE
░ (Finite Automaton) ░
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
The final summary:
| Type | Class | Machine | Can decide membership? | Real-World Vibe |
|---|---|---|---|---|
| Type 3 | Regular | DFA/NFA | Yes, $O(n)$ | “I live in the moment” |
| Type 2 | Context-Free | PDA | Yes, $O(n^3)$ | “I can handle nesting” |
| Type 1 | Context-Sensitive | LBA | Yes, but PSPACE | “Everything depends on everything” |
| Type 0 | Rec. Enumerable | TM | Semi-decidable only | “Hope I halt before the heat death of the universe” |
What’s Beyond the Hierarchy?
Even among undecidable problems, there are degrees of undecidability.
The halting problem is undecidable. But what about “does this Turing machine halt on ALL inputs?” That is even harder, it can not be solved even with a halting oracle. And there is a whole infinite hierarchy of increasingly unsolvable problems (the arithmetic hierarchy).
Level 0: Decidable problems (the easy stuff)
│
Level 1: Halting problem (undecidable but semi-decidable)
│
Level 2: Totality problem (is this program total?)
│
Level 3: ...even harder...
│
...
│
∞
But honestly? For software engineering, the first level of undecidability is enough to explain almost every limitation of our tools.
Summing Up: What This Series Taught Us
We climbed the entire Chomsky hierarchy and here is what we learned:
- Regular languages: Your input validation is fine. Regex works. Use it. But know its limits.
- Context-free languages: Parsing is solved. Use a parser generator (ANTLR, tree-sitter) instead of regex for nested structures.
- Context-sensitive languages: Cross-references, type checking, and semantic analysis are inherently harder. That is why compilers split lexing, parsing, and semantic analysis into separate phases.
- Recursively enumerable: Some problems have no solution. Not “we haven’t found one yet.” There is provably no solution. Your tools approximate, and that is the best anyone can do.
The practical takeaway?
Know which level your problem lives at. Use the simplest tool that works. Do not use regex for parsing HTML. Do not try to build a static analyzer that catches all bugs. Do not expect your CI to predict whether tests will pass without running them.
And when your program has been running for 20 minutes and you are wondering if it is stuck… sometimes the only honest answer is: wait and see :D.
That’s it. That’s the Chomsky hierarchy! Thanks for climbing it with me.
Stay curious, and … until next time!