In twelve months, Google, OpenAI, and Anthropic watched their combined share of AI token consumption collapse from 72% to 33%. The open-source insurgency isn’t just expanding the market; it’s eating the incumbents alive.
What Happened
For the past two years, the narrative was straightforward: AI is OpenAI, Google, and Anthropic. Three labs, three foundation models, a combined token share that made everyone else a rounding error.
That story ended sometime in the last twelve months. Token consumption data through June 2026 shows the Big 3’s combined share at 33% — down from 72% in June 2025. Nearly 40 percentage points transferred to open-source and Chinese models in a single year. This isn’t a gradual erosion. It’s a structural break.
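A percentage-point drop and a relative decline are different numbers, so the arithmetic is worth spelling out. The short Python sketch below uses only the two share figures quoted above; the function name is illustrative.

```python
def share_shift(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point drop, relative decline in percent)."""
    point_drop = before_pct - after_pct
    relative_decline = point_drop / before_pct * 100
    return point_drop, relative_decline


# Figures from the text: 72% in June 2025, 33% in June 2026.
points, relative = share_shift(72, 33)
print(f"{points:.0f} points, {relative:.1f}% relative decline")
# prints: 39 points, 54.2% relative decline
```

In other words, the Big 3 lost 39 points of share, which is more than half of what they held a year earlier.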
The challengers doing the taking: DeepSeek, GLM-5.2, and Meta’s Llama family. Each arrives at a fraction of the cost and, increasingly, with comparable or superior performance on specific tasks.
The Structural Read
The Jevons Paradox holds that efficiency gains in using a resource increase total consumption of that resource rather than reduce it. Applied to AI, cheaper tokens were supposed to expand the market for everyone, including the Big 3.
That’s not what’s happening.
What’s playing out is a Cognitive Jevons Paradox in reverse: cheaper open models aren’t just expanding the pie — they’re capturing share that previously belonged to premium providers. The CFO-level logic is simple: when GLM-5.2 beats GPT-5.5 on coding at 1/6th the cost, you don’t buy both. You route the workload and capture the arbitrage.
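To see why the arbitrage is hard for a CFO to ignore, here is a rough Python sketch of the blended cost when part of a workload moves to a model priced at one-sixth of the premium rate. The routed fractions are assumptions for illustration, not figures from any provider, and the sketch assumes the cheaper model is adequate for whatever is routed to it.

```python
def blended_cost(routed_fraction: float, cost_ratio: float = 1 / 6) -> float:
    """Cost relative to an all-premium baseline of 1.0.

    routed_fraction: share of tokens sent to the cheaper model (assumed).
    cost_ratio: cheaper model's price as a fraction of the premium price;
                1/6 is the ratio quoted in the text.
    """
    if not 0.0 <= routed_fraction <= 1.0:
        raise ValueError("routed_fraction must be between 0 and 1")
    return (1 - routed_fraction) + routed_fraction * cost_ratio


# Hypothetical routing levels, chosen only to show the shape of the curve.
for fraction in (0.25, 0.5, 1.0):
    saving = 1 - blended_cost(fraction)
    print(f"route {fraction:.0%} of tokens -> save {saving:.1%}")
```

Under these assumptions, routing half the tokens cuts spend by roughly 42%, and routing everything cuts it by about 83%.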
The key insight: token arbitrage, meaning routing each workload to the cheapest capable model, is now a CFO-level strategy. The enterprise that mastered tokenmaxxing in 2024 is mastering token arbitrage in 2026.
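As a minimal sketch of what "cheapest capable model" could mean in practice, the Python below picks the lowest-priced option that clears a quality threshold. Everything in it is hypothetical: the `ModelOption` type, the `route` function, and the catalog names, prices, and scores are invented for illustration and do not describe any vendor's API or real benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_million_tokens: float  # hypothetical price, in dollars
    quality_score: float            # hypothetical benchmark score, 0 to 1


def route(options: list[ModelOption], min_quality: float) -> ModelOption:
    """Pick the cheapest model whose quality meets the task's threshold."""
    capable = [m for m in options if m.quality_score >= min_quality]
    if not capable:
        raise ValueError("no model meets the quality threshold")
    return min(capable, key=lambda m: m.cost_per_million_tokens)


# Placeholder catalog: names, prices, and scores are invented.
catalog = [
    ModelOption("premium-model", 6.0, 0.90),
    ModelOption("open-model", 1.0, 0.91),
    ModelOption("small-model", 0.2, 0.60),
]

chosen = route(catalog, min_quality=0.85)
print(chosen.name)  # prints: open-model
```

The design choice that matters is the order of operations: filter on capability first, then minimize cost. A real router would also need per-task quality measurements, latency limits, and compliance rules, none of which are modeled here.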
DeepSeek’s V4-Pro adds another dimension: geopolitical independence. Running 1.6 trillion parameters on Huawei chips with zero Nvidia dependency isn’t just a cost story — it’s a supply chain thesis. For non-US enterprises with compliance constraints or strategic risk aversion, DeepSeek isn’t an alternative. It’s the only option.
OpenAI’s response — launching GPT-5.6 Sol, Terra, and Luna in quick succession — reads like a company that suddenly understands it’s in a price war it didn’t expect to fight. Each tier compresses margins further. Global AI revenues at $25B (Q1 2026, ex-China) sound large until you factor in depreciation consuming two-thirds of that figure. The premium pricing era is over. The infrastructure-as-commodity era has arrived.
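The revenue claim is easier to weigh as arithmetic. This Python sketch takes the $25B figure and reads "two-thirds" literally as a fraction of revenue, which is an assumption about how the estimate is meant.

```python
revenue_bn = 25.0            # Q1 2026 global AI revenue, ex-China, in $B (from the text)
depreciation_share = 2 / 3   # "two-thirds", taken literally (assumption)

depreciation_bn = revenue_bn * depreciation_share
remaining_bn = revenue_bn - depreciation_bn

print(f"depreciation: about ${depreciation_bn:.1f}B")
print(f"left after depreciation: about ${remaining_bn:.1f}B")
```

That leaves roughly $8.3B of the $25B before any other cost is counted.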
Three Implications
WINNERS: INFRASTRUCTURE HARNESSERS
Companies that treat AI models as interchangeable compute — routing, fine-tuning, and arbitraging across providers — will compress costs while maintaining output quality. The model layer is becoming undifferentiated infrastructure. The harness layer is where the value accrues.
PRESSURE: PREMIUM MODEL PROVIDERS
OpenAI’s three-tier launch suggests the company knows its premium pricing is under pressure. But price compression at the API layer destroys the margin that funds frontier research. The Big 3 are caught in a structurally impossible position: compete on price (kills R&D) or defend premium (loses share).
“I was shocked. GLM-5.2 beats GPT-5.5 at coding — at one-sixth the cost. We’re routing to it now.”
Vercel CEO
The Bottom Line
The Big 3 didn’t just lose nearly 40 points of token share in a year; they lost the narrative. AI is no longer synonymous with OpenAI, Google, or Anthropic. The open-source insurgency has crossed the credibility threshold where “good enough” has become “better at coding at one-sixth the cost.” Every CFO who reads the GLM-5.2 or DeepSeek benchmarks and doesn’t adjust their model routing strategy is leaving margin on the table. The tokenmaxxing era is over. Token arbitrage is the new enterprise AI competitive advantage.
Sources: Token consumption data aggregated from API platforms, June 2025–June 2026. Vercel CEO statement. GLM-5.2 benchmarks via HuggingFace. DeepSeek V4-Pro announcement. Global AI revenue data: Q1 2026 industry estimates, ex-China.