
For years, the AI race was governed by a simple formula: performance was a function of parameters, data, and compute. Add more GPUs, feed in more tokens, expand the model size, and performance climbed.
That law — elegant in its simplicity — drove the exponential rise of large language models. It explained why each generation of GPT, PaLM, or Gemini looked like a straightforward leap: more parameters, more training data, more compute.
But the curve is bending. We are entering a new scaling regime, one where the old formula no longer captures the real drivers of capability.
From Traditional to Multidimensional Scaling
The traditional law:
Performance = f(Parameters, Data, Compute)
The emerging law:
Performance = f(Parameters, Data, Compute, Memory, Context)
The shift may look subtle — two additional terms. But the implications are profound. They signal that AI capability now depends less on size, and more on structure.
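For concreteness, the traditional law is often formalized as a power law in parameter count and training tokens, in the spirit of Chinchilla-style fits. The sketch below uses that form with placeholder coefficients (the constants and the parameter and token counts are illustrative assumptions, not values to build on), simply to show how returns flatten as raw scale grows.

```python
# Illustrative sketch of the "traditional" scaling law in its power-law form:
# loss as a function of parameter count N and training tokens D.
# The coefficients are placeholders in the spirit of published Chinchilla-style
# fits, not values to rely on.

def estimated_loss(n_params: float, n_tokens: float,
                   E: float = 1.69, A: float = 406.4, B: float = 410.7,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    """L(N, D) = E + A / N^alpha + B / D^beta."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Each 10x jump in parameters buys a smaller and smaller drop in loss:
for n in (7e9, 70e9, 700e9):
    print(f"{n:.0e} params, 1.4T tokens -> loss ~ {estimated_loss(n, 1.4e12):.3f}")
```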
The Five Dimensions of Scale
1. Parameters: The Old Benchmark
Model size was once the industry’s obsession. Bigger was better: 7B to 175B to a trillion. Parameters became a proxy for power, a convenient marketing metric.
But we’ve learned that bigger is not always smarter. Beyond a threshold, returns diminish, and cost curves explode. Parameters still matter, but they no longer dominate.
2. Data: The Fuel Reservoir
The size and quality of the training corpus remain crucial. Models trained on narrow or poor-quality data hit ceilings quickly.
Yet we’ve also reached a limit: the open web is finite. Much of it is noisy or duplicative. This forces a pivot toward curated data, synthetic data, and reinforcement learning from human feedback (RLHF) as the new sources of fuel.
3. Compute: The Power Constraint
The raw FLOPs and GPU hours that underpin scaling remain non-negotiable. Compute is the hard floor beneath all progress.
But here, too, constraints bite. GPU supply is finite. Energy demands are escalating. Even hyperscalers face binding limits. Compute remains essential, but it is becoming a hard physical constraint: the rate limiter on how far raw scale alone can go.
4. Memory: The New Layer of Persistence
This is the first of the new terms. Persistent memory transforms AI from a brilliant amnesiac into a learning partner.
Instead of starting fresh with every prompt, agents can remember:
- Past interactions
- Preferences
- Evolving knowledge
Memory turns sessions into relationships, and single tasks into long-term projects. It also introduces new complexity: what to remember, how to store it, how to protect it.
But strategically, memory shifts AI from static models to adaptive systems.
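What "remembering" looks like in practice varies widely. As a minimal sketch, assuming a simple JSON file on disk and naive keyword-overlap retrieval (the class name, file path, and scoring are illustrative choices, not any particular product's design), an agent-side memory might look like this:

```python
import json
import time
from pathlib import Path

# Minimal sketch of persistent agent memory: append each interaction to a JSON
# file and pull the most relevant entries back by naive keyword overlap.

class PersistentMemory:
    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        self.entries = json.loads(self.path.read_text()) if self.path.exists() else []

    def remember(self, role: str, text: str) -> None:
        """Store one interaction and persist it to disk."""
        self.entries.append({"ts": time.time(), "role": role, "text": text})
        self.path.write_text(json.dumps(self.entries, indent=2))

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Return the k stored texts sharing the most words with the query."""
        q = set(query.lower().split())
        ranked = sorted(self.entries,
                        key=lambda e: len(q & set(e["text"].lower().split())),
                        reverse=True)
        return [e["text"] for e in ranked[:k]]

memory = PersistentMemory()
memory.remember("user", "I prefer concise answers with code examples.")
print(memory.recall("what answer style does this user prefer?"))
```

Real systems swap the keyword overlap for embeddings and add summarization, retention policies, and access controls. The point stands either way: memory is a storage-and-retrieval design problem layered on top of the model, not a function of model size.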
5. Context: The Window of Awareness
The second new term is context. Expanded context windows — 32k, 128k, 1M tokens — radically alter what models can handle.
Where once models could only “see” a paragraph or page, now they can ingest books, datasets, and multi-document corpora in a single pass. This unlocks:
- Cross-document synthesis
- Long-form reasoning
- Domain integration
Context expansion isn’t just more input. It’s a new dimension of reasoning.
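A back-of-the-envelope sketch makes the scale concrete. Assuming a rough four-characters-per-token estimate and a 128k-token window (both illustrative figures; real tokenizers and limits vary), packing several documents into a single pass is a simple budgeting exercise:

```python
# Rough sketch of packing several documents into one large context window.
# Assumes ~4 characters per token and a 128k-token window; both are crude,
# illustrative figures, and real tokenizers and model limits differ.

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def pack_documents(docs: list[str], budget: int = 128_000, reserve: int = 4_000) -> list[str]:
    """Greedily include whole documents until the window (minus a reply reserve) is full."""
    packed, used = [], 0
    for doc in docs:
        cost = approx_tokens(doc)
        if used + cost > budget - reserve:
            break
        packed.append(doc)
        used += cost
    return packed

corpus = ["contract clause " * 5_000, "audit finding " * 20_000, "email thread " * 3_000]
print(f"Packed {len(pack_documents(corpus))} of {len(corpus)} documents into a single pass.")
```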
Why This Evolution Matters
The move from a 3D to a 5D scaling law reframes the entire AI playbook. Three key implications stand out:
1. Capabilities Compound
Memory and context don’t just add power — they multiply it. Together, they enable emergent behaviors:
- Strategic planning across sessions
- Task continuity over weeks or months
- Relationship-building with users
- Self-modeling (understanding their own limits, offering proactive suggestions)
These aren’t linear gains. They’re phase transitions — thresholds where new intelligence emerges.
2. The Bottlenecks Shift
In the old law, compute was the dominant constraint. In the new law, the bottleneck is coherence.
- Attention problems: How to keep focus across massive contexts
- Integration problems: How to merge past memory with present context (sketched in code below)
- Consistency paradoxes: How to reconcile contradictions across time
These challenges are harder than adding GPUs. They’re architectural, not just infrastructural.
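The integration problem, in particular, becomes a budgeting and ranking decision long before it becomes a modeling one. A minimal sketch, assuming a fixed prompt budget and an arbitrary 25% share reserved for long-term memory (both assumptions for illustration), shows the shape of the trade-off:

```python
# Sketch of one "integration" decision: splitting a fixed prompt budget between
# retrieved long-term memory and the immediate working context. The 25% memory
# share and the ~4 chars/token estimate are arbitrary illustrative assumptions.

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def build_prompt(memories: list[str], working_context: list[str],
                 budget: int = 32_000, memory_share: float = 0.25) -> str:
    mem_budget = int(budget * memory_share)
    picked_mem, used_mem = [], 0
    for m in memories:                       # assumed already ranked by relevance
        if used_mem + approx_tokens(m) > mem_budget:
            break
        picked_mem.append(m)
        used_mem += approx_tokens(m)

    ctx_budget = budget - used_mem
    picked_ctx, used_ctx = [], 0
    for turn in reversed(working_context):   # keep the most recent turns
        if used_ctx + approx_tokens(turn) > ctx_budget:
            break
        picked_ctx.insert(0, turn)
        used_ctx += approx_tokens(turn)

    return "\n".join(["# Long-term memory:", *picked_mem,
                      "# Current context:", *picked_ctx])

print(build_prompt(["User prefers metric units."],
                   ["Q: Convert 5 miles to km.", "A: Roughly 8 km."]))
```

How that budget is split, and how contradictions between old memory and new context are resolved, is exactly the kind of architectural question the old scaling law never had to answer.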
3. The Competitive Edge Moves
If the old race was about who could afford the most compute, the new race is about who can design coherence.
Winners will be the companies that can:
- Build scalable memory architectures
- Develop dynamic attention mechanisms
- Manage contradictions without losing trust
- Deliver continuity at sustainable cost
In other words: it’s no longer a race to be biggest. It’s a race to be most coherent.
Strategic Framing
Think of the shift in terms of industry epochs:
- First Epoch: Scale by Size
Bigger models trained on more data with more GPUs.
- Second Epoch: Scale by Structure
Models enhanced by memory and context, with coherence as the binding constraint.
We are in the middle of this transition. The companies that adapt fastest will define the frontier.
Closing Thought
The story of AI scaling is no longer one of brute force. It is one of architecture.
Memory and context add two new axes that reshape the entire performance frontier. They unlock emergent intelligence but also expose coherence as the critical bottleneck.
The new scaling laws don’t just change how we measure progress. They change what progress means.
And in that lies the future of AI: not more parameters, but more dimensions of intelligence.
