Test-Time Compute: The Discovery That Models Can Think Longer, Not Just Bigger

FourWeekMBA x Business Engineer | Updated 2026

OpenAI’s o1 model, released in late 2024, introduced a new scaling dimension entirely. Rather than investing more compute in training, the model could invest more compute at inference time — “thinking” through problems step by step.

The Five Scaling Phases of AI — Animated Explainer

The Revolution

This was revolutionary because it decoupled capability from model size for the first time. The same model could produce quick, cheap answers for simple questions and expensive, thorough answers for complex ones.

The “thinking time” knob created a new scaling law: capability as a function of test-time compute, independent of parameter count.
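This new scaling law can be illustrated with a toy simulation (all numbers are hypothetical, not measurements of any real model): a fixed "model" whose single-attempt accuracy never changes, but whose final answer improves as more reasoning samples are drawn and majority-voted, a common test-time compute technique often called self-consistency.

```python
import random
from collections import Counter

random.seed(0)

def solve_once(p_correct=0.6):
    """Toy model: one reasoning attempt is right with probability p_correct."""
    return "42" if random.random() < p_correct else str(random.randint(0, 9))

def solve_with_budget(n_samples, p_correct=0.6):
    """Spend more test-time compute: sample n independent reasoning
    attempts and majority-vote the final answer."""
    votes = Counter(solve_once(p_correct) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def accuracy(n_samples, trials=2000):
    """Estimate answer accuracy at a given test-time compute budget."""
    return sum(solve_with_budget(n_samples) == "42" for _ in range(trials)) / trials

# Accuracy rises with the sampling budget while the model itself is unchanged.
for n in (1, 5, 25):
    print(f"{n:2d} samples -> accuracy {accuracy(n):.3f}")
```

The point of the sketch is the shape of the curve, not the numbers: capability climbs with inference-time sampling even though the per-attempt "model" is frozen, which is exactly a scaling axis independent of parameter count.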

The Reasoning Architecture

The model doesn’t just generate answers — it generates reasoning processes:

  • Chain-of-thought decomposition — breaking problems into intermediate steps
  • Self-verification — checking its own work before committing
  • Backtracking — recognizing when an approach isn’t working
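The three behaviors above can be sketched as a single control loop. Everything here is an invented toy (the approaches, the verifier, and the square-root task are illustrative stand-ins, not how any production model is implemented):

```python
def verify(candidate, target):
    """Self-verification: check the candidate answer before committing."""
    return candidate * candidate == target

def solve_sqrt(target, approaches):
    """Toy reasoning loop: try an approach (a decomposed step), verify the
    result, and backtrack to the next approach if verification fails."""
    trace = []
    for name, step in approaches:
        trace.append(f"trying {name}")                    # chain-of-thought record
        candidate = step(target)
        if verify(candidate, target):                     # self-verification
            trace.append(f"{name} verified: {candidate}")
            return candidate, trace
        trace.append(f"{name} failed, backtracking")      # backtracking
    return None, trace

approaches = [
    ("rough guess", lambda t: t // 2),
    ("integer search", lambda t: next(i for i in range(t + 1) if i * i >= t)),
]
answer, trace = solve_sqrt(49, approaches)
print(answer)   # the verified result
print(trace)    # the reasoning process, not just the answer
```

Note that the function returns the trace alongside the answer: the output is a reasoning process, and the trace grows with every failed attempt, which is where the extra inference tokens go.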

The Economics Inversion

In Phases 1–3, the big expense was training. In Phase 4, inference cost dominates — because every hard query generates 10–100x more tokens as the model thinks. This inverts the economics: the winner isn’t who trains the biggest model, but who thinks most efficiently per dollar spent.
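The inversion is simple arithmetic. With illustrative numbers (the per-token price and the 50x multiplier are assumptions for the example, not any vendor's actual pricing), a thinking model's per-query cost scales linearly with the tokens it generates:

```python
def query_cost(output_tokens, price_per_million=10.0):
    """Inference cost for one query at a given price per million tokens."""
    return output_tokens * price_per_million / 1e6

plain = query_cost(500)          # direct answer: ~500 output tokens
reasoned = query_cost(500 * 50)  # same query with a ~50x thinking multiplier

print(f"plain:    ${plain:.3f}")
print(f"reasoned: ${reasoned:.3f}")
```

At scale this dominates a one-time training bill, which is why per-dollar reasoning efficiency, not model size, becomes the competitive variable.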

Phase 4 gave models the ability to reason. Phase 5 asks: if they can think through problems, can they also act on the solutions?
