AWS Graviton 5 Is Amazon’s Escape Route from the Nvidia Tax — 192 Cores Built for Agentic AI

Infrastructure Analysis

AWS just made Graviton 5 generally available. 192 cores on 3nm. Purpose-built for agentic AI. Meta is deploying tens of millions of cores. This is not a chip announcement. It is Amazon building its own Layer 2 to escape the Nvidia Tax.

The Specs

AWS Graviton 5 — Key Numbers

192 cores per chip (3nm process)
5x larger L3 cache vs Graviton 4
35% faster ML inference vs previous gen
DDR5 at 8800 MT/s, fastest in cloud
PCIe 6, latest-gen storage interface
120K+ customers already on Graviton

Source: AWS (June 10, 2026)

Who’s Using It

The customer list tells the story:

Meta: Tens of millions of cores for agentic AI
Snowflake: $6B commitment to expand AWS collaboration
Uber: Agentic workloads on Graviton
Airbnb: 25% faster than competitors
Pinterest: 500M+ monthly users on Graviton
Siemens: 30%+ cost reduction on EDA tools
Epic Games: Competitive gaming with reduced latency

Meta deploying tens of millions of Graviton cores is the most important data point. Meta just cut 8,000 jobs to fund $145B in AI capex. Part of that capex is going to Amazon’s custom silicon instead of Nvidia’s GPUs.

The Nvidia Tax Escape

In the Map of AI, Layer 2 (Compute) has been a near-monopoly. Nvidia holds 75% of AI compute spend at 75% gross margins. Every hyperscaler pays the Nvidia Tax.

But the five biggest Nvidia customers are also the five biggest Nvidia competitors:

Custom Silicon vs Nvidia — The Layer 2 Diversification

Amazon: Graviton 5 (CPU) + Trainium 3 (AI accelerator)
Google: TPU v7 (Ironwood), escaping Nvidia entirely
Microsoft: Maia 100 + Cobalt 100, Azure custom silicon
Meta: MTIA + now Graviton, dual-sourcing away from Nvidia
Apple: Apple Silicon, on-device AI, zero Nvidia dependency

Custom silicon is growing 3x faster than Nvidia GPUs. The Nvidia Tax has an expiry date.

Graviton 5 is not competing with Nvidia’s H100/B200 for training frontier models. It is competing for everything else: inference, agentic workloads, databases, web applications, EDA tools. These workloads represent the majority of cloud compute spend — and they don’t need Nvidia’s margins.

The Agentic AI Angle

AWS explicitly designed Graviton 5 for agentic AI — “real-time reasoning, code generation, multi-step task orchestration.” This connects directly to Apple’s Agent OS Bet and the broader thesis that the agent is becoming the computer.

The structural insight: AI agents don’t need GPUs. They need CPUs with massive cache, fast memory, and low latency — exactly what Graviton 5 provides. The agent orchestration layer (deciding which model to call, routing queries, managing context) runs on CPUs, not GPUs.

This is why Amazon built Graviton 5 with 5x larger cache and 33% lower inter-core latency. The agent doesn’t train models — it coordinates them. And coordination is a CPU workload.

The Goldman $7.6T Read

Goldman’s $7.6 trillion AI capex projection assumes Nvidia at 75% of compute spend. If custom silicon (Graviton, TPU, Maia, MTIA) captures even 10% of that over six years, that is $380 billion that doesn’t flow through Nvidia (a figure that implies a base of about $3.8 trillion, or half the projected total).
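The arithmetic behind the $380 billion figure can be checked in a few lines. The text does not state which base the 10% applies to, so this sketch works backwards from the figure and, as an alternative reading, applies the same 10% to the full projection. The variable names are illustrative, and treating the full $7.6 trillion as a possible base is an assumption made for comparison only.

```python
# All dollar amounts are in billions, kept as integers to avoid float rounding.
TOTAL_CAPEX_B = 7600       # the $7.6 trillion projection cited in the text
CAPTURE_PCT = 10           # the "even 10%" custom-silicon scenario
CLAIMED_DIVERSION_B = 380  # the $380 billion figure in the text

# Base on which a 10% capture yields exactly the claimed $380B.
implied_base_b = CLAIMED_DIVERSION_B * 100 // CAPTURE_PCT

# Same 10% applied to the full projection instead (alternative reading).
diversion_on_total_b = TOTAL_CAPEX_B * CAPTURE_PCT // 100

# Share of the total projection that the implied base represents.
implied_base_share = implied_base_b / TOTAL_CAPEX_B

print(f"Implied base: ${implied_base_b}B")
print(f"Share of total projection: {implied_base_share:.0%}")
print(f"10% of full projection: ${diversion_on_total_b}B")
```

Changing `CAPTURE_PCT` or `TOTAL_CAPEX_B` shows how sensitive the diverted amount is to the choice of base and capture rate.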

Graviton 5 going GA with Meta as a flagship customer is the most concrete signal yet that the Nvidia Tax is being eroded from within — by the very customers who pay it.

Related:
Beyond the Nvidia Tax
Goldman Sachs: Where $7.6 Trillion Goes
Map of AI
Apple’s Agent OS Bet

Sources: AWS About Amazon (June 10, 2026), Airbnb, Siemens, Atlassian, Snowflake, Epic Games benchmarks
