From SFT to RLHF: The Thin Layers That Made ChatGPT Possible

FourWeekMBA x Business Engineer | Updated 2026

While pretraining scaling consumed the headlines, a quieter revolution was happening in the stages that came after. The production LLM stack stabilized into three layers: pretraining, supervised finetuning (SFT), and RLHF.
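The three layers can be read as three training objectives. The sketch below is illustrative only; the function names and arguments are mine, not from any lab's codebase:

```python
import math

# Illustrative sketch of the three-layer stack as loss objectives.
# Function names and arguments are hypothetical, not from any
# production training codebase.

def pretrain_loss(logprob_next_token):
    # Layer 1 - pretraining: next-token prediction over raw text.
    return -logprob_next_token

def sft_loss(response_logprobs):
    # Layer 2 - SFT: the same objective, but on curated
    # (instruction, demonstration) pairs, scored on response tokens only.
    return -sum(response_logprobs) / len(response_logprobs)

def reward_model_loss(r_chosen, r_rejected):
    # Layer 3 - RLHF: Bradley-Terry pairwise loss for the reward model
    # trained on human preference labels; the reward model then guides
    # policy optimization (commonly PPO).
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(reward_model_loss(0.0, 0.0))  # equal rewards -> loss is ln 2
```

The point of the sketch: the objective barely changes from layer 1 to layer 2 (only the data does), while layer 3 swaps next-token prediction for a learned preference signal.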


The Transformation

SFT turned a raw text predictor into something that could follow instructions. RLHF turned an instruction-follower into something that felt helpful, harmless, and honest. Together, they were the recipe that made ChatGPT possible.
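One way to see how RLHF makes a model feel helpful without letting it drift from its SFT behavior is the standard KL-penalized objective. A minimal sketch, with a hypothetical beta and made-up log-probabilities:

```python
def kl_penalized_reward(reward, logp_policy, logp_sft, beta=0.1):
    # Per-sample RLHF policy objective: reward-model score minus a KL
    # penalty that keeps the tuned policy close to the SFT model.
    # beta is a hypothetical placeholder; in practice it is tuned per run.
    kl_term = logp_policy - logp_sft  # log-ratio (summed over tokens upstream)
    return reward - beta * kl_term

# The further the policy drifts from the SFT model (larger log-ratio),
# the more the penalty eats into the raw reward.
print(kl_penalized_reward(1.0, -1.0, -2.0, beta=0.5))
```

The design choice this encodes: RLHF does not replace the SFT model, it optimizes reward in a neighborhood around it.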

The Escalating Economics

Post-training costs escalated rapidly. Llama 2’s post-training cost $10–20 million. Llama 3.1’s exceeded $50 million — despite using similar volumes of preference data. The cost increase came from increasingly complex processes that required specialized teams of roughly 200 people.
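To make the "similar volumes, higher cost" point concrete, here is back-of-the-envelope arithmetic. The dollar figures come from above; the preference-pair volume is a made-up placeholder, since the article says only that volumes were similar:

```python
# Back-of-the-envelope only. Dollar figures are from the article;
# the pair count is a hypothetical placeholder.
llama2_post_training = 15e6    # midpoint of the $10-20M range
llama31_post_training = 50e6   # lower bound of "exceeded $50M"
assumed_preference_pairs = 1_000_000  # hypothetical

per_pair_l2 = llama2_post_training / assumed_preference_pairs
per_pair_l31 = llama31_post_training / assumed_preference_pairs

# With volume held fixed, the implied cost per unit of preference
# data more than triples - the increase is in process, not data.
print(f"{per_pair_l31 / per_pair_l2:.2f}x")
```

Whatever the true volume, holding it fixed across both models shows the per-label cost multiplier directly: the ratio of the two totals.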

The Structural Ceiling

These stages had fundamental limitations:

  • SFT ceiling: The model can never exceed what its human demonstrators showed it
  • RLHF ceiling: Models learn to produce outputs that look correct rather than outputs that are correct
  • Reward ceiling: The signal is noisy (humans disagree), expensive (every label needs a paid annotator), and subjective
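The noisy-reward point can be made concrete with a toy calculation: when annotators disagree, even a perfect reward model is capped by the majority-vote ceiling. The votes below are invented data:

```python
from collections import Counter

# Toy, invented annotator votes over four preference pairs ("A" vs "B").
votes_per_pair = [
    ["A", "A", "B"],   # 2 of 3 agree
    ["A", "B", "B"],
    ["A", "A", "A"],   # unanimous
    ["B", "A", "B"],
]

def majority_agreement(all_votes):
    # Average fraction of annotators siding with the per-pair majority;
    # this caps how often any single "ground truth" label can match a
    # randomly drawn annotator - the reward model's accuracy ceiling.
    rates = []
    for votes in all_votes:
        top_count = Counter(votes).most_common(1)[0][1]
        rates.append(top_count / len(votes))
    return sum(rates) / len(rates)

print(f"{majority_agreement(votes_per_pair):.2f}")
```

With three annotators and frequent 2-to-1 splits, the ceiling sits well below 100%: no amount of reward-model compute recovers signal the labels never contained.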

These weren’t fixable problems. They were structural constraints of the paradigm — and they set up the need for Phases 4 and 5.

