AI Token Market Share: Big 3 Drop from 72% to 33% as Open-Source Takes Over

In twelve months, Google, OpenAI, and Anthropic watched their combined share of AI token consumption collapse from 72% to 33%. The open-source insurgency isn’t just expanding the market; it’s eating the incumbents alive.

TOKEN MARKET SHARE COLLAPSE — JUNE 2025 → JUNE 2026

Big 3 share (June 2025) 72%
Big 3 share (June 2026) 33%
Share lost in 12 months 39 pts
Global AI sales, Q1 2026 (ex-China) $25B

What Happened

For the past two years, the narrative was straightforward: AI is OpenAI, Google, and Anthropic. Three labs, three foundation models, a combined token share that made everyone else a rounding error.

That story ended sometime in the last twelve months. Token consumption data through June 2026 shows the Big 3’s combined share at 33% — down from 72% in June 2025. Nearly 40 percentage points transferred to open-source and Chinese models in a single year. This isn’t a gradual erosion. It’s a structural break.
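The size of the shift depends on how you measure it. A quick sketch, using only the two share figures quoted above, separates the absolute drop in percentage points from the relative decline in the Big 3's own share:

```python
# Figures taken from the article: Big 3 combined token share.
share_2025 = 72.0  # percent, June 2025
share_2026 = 33.0  # percent, June 2026

# Absolute change, measured in percentage points.
points_lost = share_2025 - share_2026

# Relative decline: the fraction of the original share that disappeared.
relative_decline = points_lost / share_2025 * 100

print(f"Points lost: {points_lost:.0f}")
print(f"Relative decline: {relative_decline:.1f}%")
```

In other words, the Big 3 gave up 39 points of the total market, which is a little over half of the share they held a year earlier.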

The challengers doing the taking: DeepSeek, GLM-5.2, and Meta’s Llama family. Each arrives at a fraction of the cost and, increasingly, with comparable or superior performance on specific tasks.

THE 12-MONTH ARC

June 2025

Big 3 at 72% token share. Open-source treated as a research curiosity, not a production threat.

Q4 2025

DeepSeek V4-Pro launches with 1.6T parameters running entirely on Huawei chips — zero Nvidia. Benchmark parity with GPT-4o-class models at 80% lower API cost.

Q1 2026

GLM-5.2 releases. Vercel CEO publicly states he was “shocked” — GLM-5.2 beats GPT-5.5 on coding benchmarks at 1/6th the cost. Enterprise routing begins in earnest.

Q2 2026

OpenAI responds with three new tiers: GPT-5.6 Sol ($5/$30, presumably the input/output price per million tokens), Terra (half the cost), and Luna (cheapest ever). The pricing compression confirms the threat is real.

June 2026

Big 3 token share: 33%. Thirty-nine points surrendered in twelve months. Global AI sales hit $25B in Q1 2026 (ex-China), but margins are thin: depreciation consumes two-thirds of revenue.
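The tier pricing in the Q2 2026 entry can be made concrete with a little arithmetic. The sketch below assumes that "$5/$30" means US dollars per million input and output tokens, and that Terra is exactly half of Sol on both sides; neither assumption is confirmed by the source. The monthly workload is hypothetical, and Luna is left out because no price is given for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    input_usd_per_m: float   # assumed unit: USD per million input tokens
    output_usd_per_m: float  # assumed unit: USD per million output tokens

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total USD cost for a given number of input and output tokens."""
        return ((input_tokens / 1e6) * self.input_usd_per_m
                + (output_tokens / 1e6) * self.output_usd_per_m)


# Sol prices come from the article; Terra is assumed to be half of Sol.
sol = Tier("Sol", 5.0, 30.0)
terra = Tier("Terra", sol.input_usd_per_m / 2, sol.output_usd_per_m / 2)

# Hypothetical monthly workload, chosen only for illustration.
input_tokens = 200_000_000
output_tokens = 50_000_000

sol_cost = sol.cost(input_tokens, output_tokens)
terra_cost = terra.cost(input_tokens, output_tokens)

print(f"Sol:   ${sol_cost:,.2f}")
print(f"Terra: ${terra_cost:,.2f}")
```

Under these assumptions, output tokens dominate the bill even though they are a fifth of the volume, which is why routing generation-heavy workloads matters most.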

The Structural Read

The Jevons Paradox predicts that efficiency gains in using a resource increase total consumption of it rather than decrease it. Applied to AI, cheaper tokens were supposed to expand the market for everyone, including the Big 3.

That’s not what’s happening.

What’s playing out is a Cognitive Jevons Paradox in reverse: cheaper open models aren’t just expanding the pie — they’re capturing share that previously belonged to premium providers. The CFO-level logic is simple: when GLM-5.2 beats GPT-5.5 on coding at 1/6th the cost, you don’t buy both. You route the workload and capture the arbitrage.

The key insight: Token arbitraging — routing workloads to the cheapest capable model — is now a CFO-level strategy. The enterprise that mastered tokenmaxxing in 2024 is now mastering token arbitrage in 2026.
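A minimal sketch of the routing rule described above, assuming each model has a known price and a per-task quality score. The model names, prices, and scores are hypothetical placeholders, not figures from the benchmarks cited here; the 6x price gap only mirrors the ratio quoted for GLM-5.2.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Model:
    name: str
    usd_per_million_tokens: float  # hypothetical blended price
    # task name -> quality score in [0, 1]; hypothetical values
    scores: dict = field(default_factory=dict)


def route(task: str, min_score: float, catalog: list) -> Optional[Model]:
    """Return the cheapest model whose score for `task` meets `min_score`.

    Returns None when no model in the catalog is capable enough.
    """
    capable = [m for m in catalog if m.scores.get(task, 0.0) >= min_score]
    return min(capable, key=lambda m: m.usd_per_million_tokens, default=None)


# Hypothetical catalog: one premium model, one open model at 1/6th the price.
CATALOG = [
    Model("premium-model", 12.0, {"coding": 0.90, "legal-review": 0.92}),
    Model("open-model", 2.0, {"coding": 0.91, "legal-review": 0.70}),
]

for task in ("coding", "legal-review", "translation"):
    choice = route(task, min_score=0.85, catalog=CATALOG)
    print(task, "->", choice.name if choice else "no capable model")
```

The design choice worth noting is the order of operations: capability is a hard filter and price is only the tiebreaker, so a cheaper model never wins a workload it cannot handle.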

DeepSeek’s V4-Pro adds another dimension: geopolitical independence. Running 1.6 trillion parameters on Huawei chips with zero Nvidia dependency isn’t just a cost story; it’s a supply chain thesis. For non-US enterprises with compliance constraints or strategic risk aversion, DeepSeek isn’t merely an alternative. It may be the only viable option.

OpenAI’s response — launching GPT-5.6 Sol, Terra, and Luna in quick succession — reads like a company that suddenly understands it’s in a price war it didn’t expect to fight. Each tier compresses margins further. Global AI revenues at $25B (Q1 2026, ex-China) sound large until you factor in depreciation consuming two-thirds of that figure. The premium pricing era is over. The infrastructure-as-commodity era has arrived.
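To see why $25B is less comfortable than it sounds, here is the arithmetic implied by the two figures above. It uses only the article's revenue number and the two-thirds depreciation share, and it says nothing about other costs such as staff, energy, or research.

```python
# Figures from the article: Q1 2026 global AI sales (ex-China) and the
# share of that revenue consumed by depreciation.
revenue_bn = 25.0
depreciation_share = 2 / 3

depreciation_bn = revenue_bn * depreciation_share
remaining_bn = revenue_bn - depreciation_bn  # before any other cost line

print(f"Depreciation: ${depreciation_bn:.1f}B")
print(f"Left before other costs: ${remaining_bn:.1f}B")
```

Roughly $8.3B is left to cover everything else, which is the context for reading each new price tier as margin compression.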

TOKEN SHARE — JUNE 2026

Open-Source (DeepSeek / GLM / Llama) 67%
Big 3 (OpenAI / Google / Anthropic) 33%

Three Implications

WINNERS: INFRASTRUCTURE HARNESSERS

Companies that treat AI models as interchangeable compute — routing, fine-tuning, and arbitraging across providers — will compress costs while maintaining output quality. The model layer is becoming undifferentiated infrastructure. The harness layer is where the value accrues.
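One way a harness can compress cost while holding quality is a cheapest-first cascade: try the cheapest provider, check the answer, and escalate only if the check fails. The sketch below is an illustration under stated assumptions, not a description of any vendor's product. The providers are local stubs, the per-call costs are hypothetical, and the acceptance check is a placeholder for a real task-specific evaluator.

```python
from typing import Callable, Optional

# (name, hypothetical cost per call in USD, function that answers a prompt)
Provider = tuple[str, float, Callable[[str], str]]


def cascade(prompt: str,
            providers: list,
            accept: Callable[[str], bool]):
    """Try providers cheapest-first and return the first accepted answer.

    Returns (provider name, answer, total spend). Name and answer are None
    if every provider's answer is rejected.
    """
    spent = 0.0
    for name, cost, call in sorted(providers, key=lambda p: p[1]):
        answer = call(prompt)
        spent += cost  # every attempt is paid for, accepted or not
        if accept(answer):
            return name, answer, spent
    return None, None, spent


def cheap_stub(prompt: str) -> str:
    return ""  # simulates an unusable answer from the cheap model


def premium_stub(prompt: str) -> str:
    return "def add(a, b): return a + b"


providers = [
    ("premium-stub", 0.06, premium_stub),
    ("cheap-stub", 0.01, cheap_stub),
]

name, answer, spent = cascade(
    "write add()", providers, accept=lambda a: a.strip() != ""
)
print(name, f"${spent:.2f}")
```

The trade-off is visible in the spend: a failed cheap attempt is still billed, so a cascade only pays off when the cheap model's answers are accepted most of the time.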

PRESSURE: PREMIUM MODEL PROVIDERS

OpenAI’s three-tier launch suggests the company knows the pressure is real. But price compression at the API layer destroys the margin that funds frontier research. The Big 3 are caught in a structural bind: compete on price (which starves R&D) or defend premium pricing (which cedes share).

WATCH: GEOPOLITICAL MODEL ROUTING

DeepSeek V4-Pro running on Huawei silicon isn’t just a Chinese AI story; it’s a signal that enterprises constrained by export controls will develop sovereign AI stacks. The Nvidia monopoly on AI inference has already cracked at the foundation model level.

Vercel CEO

“I was shocked. GLM-5.2 beats GPT-5.5 at coding — at one-sixth the cost. We’re routing to it now.”

Business Engineer Framework

The CFO’s Guide to Token Arbitrage

The enterprise AI playbook has shifted from tokenmaxxing (using AI as much as possible) to token arbitraging (routing workloads to the cheapest capable model). The Business Engineer’s CFO Guide maps exactly how to build a multi-model routing stack — and what the Big 3’s market share collapse means for your AI budget.

The Bottom Line

The Big 3 didn’t just lose 39 points of token market share in a year; they lost the narrative. AI is no longer synonymous with OpenAI, Google, or Anthropic. The open-source insurgency has crossed the credibility threshold where “good enough” has become “better at coding at one-sixth the price.” Every CFO who reads the GLM-5.2 or DeepSeek benchmarks and doesn’t adjust their model routing strategy is leaving margin on the table. The tokenmaxxing era is over. Token arbitraging is the new enterprise AI competitive advantage.

Sources: Token consumption data aggregated from API platforms, June 2025–June 2026. Vercel CEO statement. GLM-5.2 benchmarks via HuggingFace. DeepSeek V4-Pro announcement. Global AI revenue data: Q1 2026 industry estimates, ex-China.

FourWeekMBA