From SFT to RLHF: The Thin Layers That Made ChatGPT Possible

FourWeekMBA x Business Engineer | Updated 2026

While pretraining scaling consumed the headlines, a quieter revolution was happening in the stages that came after. The production LLM stack stabilized into three layers: pretraining, supervised finetuning (SFT), and RLHF.
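The three layers can be read as three training objectives. The sketch below is illustrative only; the function names and arguments are mine, not from any lab's codebase:

```python
import math

# Illustrative sketch of the three-layer stack as loss objectives.
# Function names and arguments are hypothetical, not from any
# production training codebase.

def pretrain_loss(logprob_next_token):
    # Layer 1 - pretraining: next-token prediction over raw text.
    return -logprob_next_token

def sft_loss(response_logprobs):
    # Layer 2 - SFT: the same objective, but on curated
    # (instruction, demonstration) pairs, scored on response tokens only.
    return -sum(response_logprobs) / len(response_logprobs)

def reward_model_loss(r_chosen, r_rejected):
    # Layer 3 - RLHF: Bradley-Terry pairwise loss for the reward model
    # trained on human preference labels; the reward model then guides
    # policy optimization (commonly PPO).
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(reward_model_loss(0.0, 0.0))  # equal rewards -> loss is ln 2
```

The point of the sketch: the objective barely changes from layer 1 to layer 2 (only the data does), while layer 3 swaps next-token prediction for a learned preference signal.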


The Transformation

SFT turned a raw text predictor into something that could follow instructions. RLHF turned an instruction-follower into something that felt helpful, harmless, and honest. Together, they were the recipe that made ChatGPT possible.
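One way to see how RLHF makes a model feel helpful without letting it drift from its SFT behavior is the standard KL-penalized objective. A minimal sketch, with a hypothetical beta and made-up log-probabilities:

```python
def kl_penalized_reward(reward, logp_policy, logp_sft, beta=0.1):
    # Per-sample RLHF policy objective: reward-model score minus a KL
    # penalty that keeps the tuned policy close to the SFT model.
    # beta is a hypothetical placeholder; in practice it is tuned per run.
    kl_term = logp_policy - logp_sft  # log-ratio (summed over tokens upstream)
    return reward - beta * kl_term

# The further the policy drifts from the SFT model (larger log-ratio),
# the more the penalty eats into the raw reward.
print(kl_penalized_reward(1.0, -1.0, -2.0, beta=0.5))
```

The design choice this encodes: RLHF does not replace the SFT model, it optimizes reward in a neighborhood around it.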

The Escalating Economics

Post-training costs escalated rapidly. Llama 2’s post-training cost $10–20 million. Llama 3.1’s exceeded $50 million — despite using similar volumes of preference data. The cost increase came from increasingly complex processes that required specialized teams of roughly 200 people.
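To make the "similar volumes, higher cost" point concrete, here is back-of-the-envelope arithmetic. The dollar figures come from above; the preference-pair volume is a made-up placeholder, since the article says only that volumes were similar:

```python
# Back-of-the-envelope only. Dollar figures are from the article;
# the pair count is a hypothetical placeholder.
llama2_post_training = 15e6    # midpoint of the $10-20M range
llama31_post_training = 50e6   # lower bound of "exceeded $50M"
assumed_preference_pairs = 1_000_000  # hypothetical

per_pair_l2 = llama2_post_training / assumed_preference_pairs
per_pair_l31 = llama31_post_training / assumed_preference_pairs

# With volume held fixed, the implied cost per unit of preference
# data more than triples - the increase is in process, not data.
print(f"{per_pair_l31 / per_pair_l2:.2f}x")
```

Whatever the true volume, holding it fixed across both models shows the per-label cost multiplier directly: the ratio of the two totals.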

The Structural Ceiling

These stages had fundamental limitations:

  • SFT ceiling: The model can never exceed what its human demonstrators showed it
  • RLHF ceiling: Models learn to produce outputs that look correct rather than outputs that are correct
  • Reward ceiling: The signal is noisy (humans disagree), expensive (every label needs a paid annotator), and subjective
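The noisy-reward point can be made concrete with a toy calculation: when annotators disagree, even a perfect reward model is capped by the majority-vote ceiling. The votes below are invented data:

```python
from collections import Counter

# Toy, invented annotator votes over four preference pairs ("A" vs "B").
votes_per_pair = [
    ["A", "A", "B"],   # 2 of 3 agree
    ["A", "B", "B"],
    ["A", "A", "A"],   # unanimous
    ["B", "A", "B"],
]

def majority_agreement(all_votes):
    # Average fraction of annotators siding with the per-pair majority;
    # this caps how often any single "ground truth" label can match a
    # randomly drawn annotator - the reward model's accuracy ceiling.
    rates = []
    for votes in all_votes:
        top_count = Counter(votes).most_common(1)[0][1]
        rates.append(top_count / len(votes))
    return sum(rates) / len(rates)

print(f"{majority_agreement(votes_per_pair):.2f}")
```

With three annotators and frequent 2-to-1 splits, the ceiling sits well below 100%: no amount of reward-model compute recovers signal the labels never contained.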

These weren’t fixable problems. They were structural constraints of the paradigm — and they set up the need for Phases 4 and 5.

