The AI Memory Tax & the Bifurcation of AI Scaling Laws

For most of semiconductor history, memory was the unglamorous sibling of compute (a dynamic explored in the economics of AI compute infrastructure). Intel and later NVIDIA carried the branded, premium, era-defining narratives.

Memory was made in Korea and Japan, sold as a commodity, and lived through the most brutal cycles in tech: peak-to-trough revenue swings routinely above 50%, suppliers wiped out every decade, and survivors who learned to bleed quietly through downturns and harvest aggressively through upturns.

The industry consolidated from dozens of players in the 1990s to three by the 2010s, and even those three traded at low multiples because the market priced them as cyclical commodity producers rather than strategic assets.

The first was an obscure academic prediction. In 1995, computer architects William Wulf and Sally McKee published a paper titled “Hitting the Memory Wall,” arguing that processor speed was improving exponentially while memory access latency was barely moving, and that applications would eventually spend their time waiting for data rather than computing. For thirty years the prediction was directionally right but commercially manageable: software architects partitioned around the wall, caches got bigger, problems got reshaped.
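The arithmetic behind the memory-wall argument can be sketched in a few lines. What follows is a rough illustration, not a reproduction of the paper's figures: it uses the standard weighted-average form for access time (cache hits at cache speed, misses at memory speed), and the hit rate, the growth rates, and the starting miss cost are all placeholder assumptions chosen only to show the shape of the curve. The function names are hypothetical.

```python
def avg_access_cycles(hit_rate, cache_cycles, memory_cycles):
    """Average cost of one memory access, in processor cycles.

    Weighted average: hits cost cache_cycles, misses cost memory_cycles.
    """
    return hit_rate * cache_cycles + (1.0 - hit_rate) * memory_cycles


def project(years, hit_rate=0.99, cpu_growth=0.80, dram_growth=0.07,
            memory_cycles_now=4.0):
    """Project the average access cost as the processor outpaces memory.

    All defaults are illustrative assumptions, not measured data:
    - cpu_growth / dram_growth: annual speed improvement of each part
    - memory_cycles_now: cost of a miss today, in processor cycles
    A cache hit is assumed to cost one cycle throughout.
    """
    rows = []
    for year in range(years + 1):
        # A miss gets more expensive in processor cycles every year,
        # because the processor speeds up faster than the memory does.
        gap = ((1.0 + cpu_growth) / (1.0 + dram_growth)) ** year
        miss_cost = memory_cycles_now * gap
        rows.append((year, avg_access_cycles(hit_rate, 1.0, miss_cost)))
    return rows


if __name__ == "__main__":
    for year, cycles in project(15):
        print(f"year {year:2d}: {cycles:8.2f} cycles per access")
```

Even with a cache that hits 99% of the time, the rare miss comes to dominate the average once the speed gap compounds for long enough. Bigger caches and reshaped problems raise the hit rate, but they do not change the exponent, which is why those workarounds only managed the wall rather than removing it.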