From SFT to RLHF: The Thin Layers That Made ChatGPT Possible

While pretraining scaling consumed the headlines, a quieter revolution was happening in the stages that came after. The production LLM stack stabilized into three layers: pretraining, supervised finetuning (SFT), and RLHF.

The Transformation

SFT turned a raw text predictor into something that could follow instructions. RLHF turned an instruction-follower into something that felt helpful, harmless, and honest. Together, they were the recipe that made ChatGPT possible.
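Mechanically, SFT is ordinary supervised learning: the model is trained with cross-entropy on instruction–response pairs, usually with the loss computed only over the response tokens while the prompt tokens are masked out. A minimal sketch of that masked loss (the token probabilities and mask below are illustrative toy values, not from any real model):

```python
import math

def sft_loss(token_probs, response_mask):
    """Cross-entropy averaged over response tokens only.

    token_probs: the model's probability for each target token.
    response_mask: 1 where the token belongs to the assistant response,
                   0 for prompt tokens (excluded from the loss).
    """
    losses = [-math.log(p) for p, m in zip(token_probs, response_mask) if m]
    return sum(losses) / len(losses)

# Toy sequence: 3 prompt tokens followed by 3 response tokens.
probs = [0.9, 0.8, 0.95, 0.6, 0.7, 0.5]   # illustrative values
mask  = [0,   0,   0,    1,   1,   1]
loss = sft_loss(probs, mask)
```

The masking is why SFT inherits its ceiling: the gradient only ever pushes the model toward reproducing the demonstrated responses, never beyond them.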

The Escalating Economics

Post-training costs escalated rapidly. Llama 2’s post-training cost $10–20 million; Llama 3.1’s exceeded $50 million, despite using similar volumes of preference data. The increase came from more complex processes that required specialized teams of roughly 200 people.

The Structural Ceiling

These stages had fundamental limitations:

  • SFT ceiling: the model can never exceed the quality of the human demonstrations it imitates
  • RLHF ceiling: models learn to produce outputs that look correct to human raters rather than outputs that are correct
  • Reward-signal ceiling: preference labels are noisy (humans disagree), expensive (every label needs a paid annotator), and subjective

These weren’t fixable problems. They were structural constraints of the paradigm, and they set up the need for Phases 4 and 5.
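The RLHF ceiling is easiest to see in the reward model itself. In the standard pairwise (Bradley–Terry) formulation, the reward model is trained only to score the annotator-preferred response above the rejected one, so it learns "looks better to the labeler", never "is correct". A minimal sketch of that pairwise loss, with illustrative reward values:

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).

    Driven entirely by the gap between scores on a human-labeled pair --
    the reward model never sees ground truth, only which answer the
    annotator preferred.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# When annotators disagree, the same pair shows up with the labels
# flipped, and the two gradients pull the reward model in opposite
# directions -- the noise in the signal is baked into the data.
loss_agree   = preference_loss(2.0, 0.5)  # labeler preferred the stronger answer
loss_flipped = preference_loss(0.5, 2.0)  # a disagreeing labeler's pair
```

Because the objective rewards winning the comparison rather than being right, a model that optimizes it hard enough will find plausible-looking wrong answers that annotators happen to prefer.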
