The Kaplan Era: When "Just Make It Bigger" Launched the AI Revolution
In 2020, OpenAI published the original scaling laws ("Scaling Laws for Neural Language Models," Kaplan et al.) alongside GPT-3, establishing the first quantitative framework for AI capability growth. The thesis was straightforward: performance scales as a power law with model size.
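In rough form, the paper's headline result: held-out loss L falls as a power law in non-embedding parameter count N, and separately in dataset size D. The constants below are the approximate fitted values reported by Kaplan et al.; treat them as illustrative rather than exact.

```latex
% Kaplan et al. (2020): loss scales as a power law in model size N and data size D.
% Approximate fitted constants from the paper (N in non-embedding parameters, D in tokens).
\begin{align*}
L(N) &\approx \left(\frac{N_c}{N}\right)^{\alpha_N}, & \alpha_N &\approx 0.076, & N_c &\approx 8.8 \times 10^{13} \\
L(D) &\approx \left(\frac{D_c}{D}\right)^{\alpha_D}, & \alpha_D &\approx 0.095, & D_c &\approx 5.4 \times 10^{13}
\end{align*}
```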
The Numbers
GPT-3 used 175 billion parameters trained on 300 billion tokens — a ratio of roughly 1.7 tokens per parameter. The assumption: model size mattered more than data volume.
The industry responded accordingly. The race was to build the biggest model possible within a given compute budget.
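To make that allocation concrete, here is a minimal sketch, assuming the standard C ≈ 6·N·D FLOPs approximation and the papers' approximate fitted exponents (model size growing roughly as C^0.73 under Kaplan, C^0.5 under Chinchilla). It shows where each recipe sends a 10x compute increase:

```python
# Sketch, not the papers' exact fits: under C ~ 6*N*D, a rule N_opt ∝ C^a
# implies D_opt ∝ C^(1-a). Kaplan et al. fit roughly a ≈ 0.73; Chinchilla
# later fit roughly a ≈ 0.5. Scaling compute 10x shows the difference.

C_GROWTH = 10.0  # multiply the compute budget by 10x

for name, a in [("Kaplan (2020)", 0.73), ("Chinchilla (2022)", 0.50)]:
    n_growth = C_GROWTH ** a        # how much the model should grow
    d_growth = C_GROWTH ** (1 - a)  # how much the dataset should grow
    print(f"{name}: params x{n_growth:.1f}, tokens x{d_growth:.1f}")

# Kaplan (2020): params x5.4, tokens x1.9    -> spend compute on model size
# Chinchilla (2022): params x3.2, tokens x3.2 -> split it evenly
```

Under the Kaplan fit, nearly all of a bigger budget goes into parameters, which is exactly the behavior the industry raced to follow.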
What It Got Right
The Kaplan paper proved that capability scales predictably with compute. GPT-3’s few-shot abilities genuinely surprised researchers and launched the generative AI wave. Scaling wasn’t a guess anymore — it was an infrastructure blueprint.
The Critical Blind Spot
The scaling law dramatically undervalued data relative to parameters. Models were large but undertrained. It would take DeepMind’s Chinchilla paper two years later to reveal just how wrong the allocation was — GPT-3 should have been trained on 11x more data or been 20x smaller.
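As a sanity check on those figures, a few lines of arithmetic recover them, assuming the widely cited Chinchilla heuristic of roughly 20 training tokens per parameter (an approximation, not an exact law):

```python
# Back-of-envelope check of the article's numbers.

params = 175e9   # GPT-3 parameters
tokens = 300e9   # GPT-3 training tokens

ratio = tokens / params
print(f"tokens per parameter: {ratio:.2f}")          # ~1.71

chinchilla_ratio = 20.0  # compute-optimal tokens/parameter (approx.)
data_gap = chinchilla_ratio / ratio
print(f"data shortfall at this size: ~{data_gap:.1f}x")  # ~11.7x, the "11x" above

optimal_tokens = params * chinchilla_ratio
print(f"tokens a 175B model 'wanted': {optimal_tokens:.1e}")  # ~3.5e12
```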
The Kaplan era wasn’t just a research finding — it was the largest capital deployment in computing history. Every GPU cluster built, every training run funded, followed this blueprint.