AWS Graviton 5 Is Amazon's Escape Route from the Nvidia Tax — 192 Cores Built for Agentic AI

Infrastructure Analysis — AWS just made Graviton 5 generally available. 192 cores on 3nm. Purpose-built for agentic AI. Meta is deploying tens of millions of cores. This is not a chip announcement. It is Amazon building its own Layer 2 to escape the Nvidia Tax.

The Specs

AWS Graviton 5 — Key Numbers

192 cores per chip (3nm process)

5x larger L3 cache vs Graviton 4

35% faster ML inference vs previous gen

DDR5 at 8800 MHz, fastest in cloud

PCIe 6, latest-gen storage interface

120K+ customers on Graviton already

Source: AWS (June 10, 2026)

Who’s Using It

The customer list tells the story:

Meta: Tens of millions of cores for agentic AI

Snowflake: $6B commitment to expand AWS collaboration

Uber: Agentic workloads on Graviton

Airbnb: 25% faster than competitors

Pinterest: 500M+ monthly users on Graviton

Siemens: 30%+ cost reduction on EDA tools

Epic Games: Competitive gaming with reduced latency

Meta deploying tens of millions of Graviton cores is the most important data point. Meta just cut 8,000 jobs to fund $145B in AI capex. Part of that capex is going to Amazon’s custom silicon instead of Nvidia’s GPUs.

The Nvidia Tax Escape

In the Map of AI, Layer 2 (Compute) has been a near-monopoly. Nvidia holds 75% of AI compute spend at 75% gross margins. Every hyperscaler pays the Nvidia Tax.

But the five biggest Nvidia customers are also the five biggest Nvidia competitors:

Custom Silicon vs Nvidia — The Layer 2 Diversification

Amazon: Graviton 5 (CPU) + Trainium 3 (AI accelerator)

Google: TPU v7 (Ironwood), escaping Nvidia entirely

Microsoft: Maia 100 + Cobalt 100, Azure custom silicon

Meta: MTIA + now Graviton, dual-sourcing away from Nvidia

Apple: Apple Silicon, on-device AI with zero Nvidia dependency

Custom silicon is growing 3x faster than Nvidia GPUs. The Nvidia Tax has an expiry date.

Graviton 5 is not competing with Nvidia’s H100/B200 for training frontier models. It is competing for everything else: inference, agentic workloads, databases, web applications, EDA tools. These workloads represent the majority of cloud compute spend — and they don’t need Nvidia’s margins.

The Agentic AI Angle

AWS explicitly designed Graviton 5 for agentic AI — “real-time reasoning, code generation, multi-step task orchestration.” This connects directly to Apple’s Agent OS Bet and the broader thesis that the agent is becoming the computer.

The structural insight: AI agents don’t need GPUs. They need CPUs with massive cache, fast memory, and low latency — exactly what Graviton 5 provides. The agent orchestration layer (deciding which model to call, routing queries, managing context) runs on CPUs, not GPUs.

This is why Amazon built Graviton 5 with 5x larger cache and 33% lower inter-core latency. The agent doesn’t train models — it coordinates them. And coordination is a CPU workload.
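To make "coordination is a CPU workload" concrete, here is a minimal sketch of an orchestration step: routing a query to a model tier and managing a rolling context window. Everything in it is an assumption for illustration. The model names, the keyword rules, and the `AgentContext` structure are hypothetical and do not come from AWS or any real agent framework. No model is actually called; the point is that this logic is plain branching and memory management, with no matrix math.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Rolling window of recent turns held in memory (hypothetical structure)."""
    max_turns: int = 4
    turns: deque = field(default_factory=deque)

    def add(self, text: str) -> None:
        self.turns.append(text)
        # Drop the oldest turns once the window is full.
        while len(self.turns) > self.max_turns:
            self.turns.popleft()


# Hypothetical model tiers; placeholder names, not real endpoints.
ROUTES = {"code": "code-model", "reasoning": "large-model", "default": "small-model"}


def route(query: str) -> str:
    """Pick a model tier using simple keyword rules (illustrative heuristic only)."""
    lowered = query.lower()
    if "def " in lowered or "function" in lowered:
        return ROUTES["code"]
    if len(lowered.split()) > 12 or "why" in lowered:
        return ROUTES["reasoning"]
    return ROUTES["default"]


def orchestrate(queries, context):
    """Route each query and record it in the context; returns (query, target) pairs."""
    plan = []
    for q in queries:
        target = route(q)
        context.add(q)
        plan.append((q, target))
    return plan


demo_context = AgentContext(max_turns=2)
demo_plan = orchestrate(
    ["Write a function to sort a list", "Why is the sky blue", "hello"],
    demo_context,
)
for query, target in demo_plan:
    print(f"{target:12s} <- {query}")
```

In a real deployment the routing rules would be far richer, but the shape stays the same: the GPU or accelerator only enters the picture once `route` has decided where the request goes.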

The Goldman $7.6T Read

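Goldman’s $7.6 trillion AI capex projection assumes Nvidia at 75% of compute spend. If custom silicon (Graviton, TPU, Maia, MTIA) captures even 10% of that over six years, that is $380 billion that doesn’t flow through Nvidia. Note that $380 billion is 10% of roughly $3.8 trillion, so the figure implies a base of about half the total projection; 10% of the full $7.6 trillion would be $760 billion.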
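The arithmetic behind this estimate is easy to check. The sketch below uses only the three numbers quoted in the article (the $7.6 trillion projection, the 10% share, and the $380 billion result). It computes what 10% of the full projection would be, and which base the $380 billion figure implies. The variable and function names are made up for illustration, and the article does not say which base Goldman or the author actually used.

```python
TOTAL_CAPEX_USD = 7.6e12     # Goldman projection cited in the article
CUSTOM_SHARE = 0.10          # the "even 10%" scenario from the article
ARTICLE_FIGURE_USD = 380e9   # diverted spend stated in the article


def diverted_spend(base_usd: float, share: float) -> float:
    """Spend captured by custom silicon for a given base and share."""
    return base_usd * share


def implied_base(diverted_usd: float, share: float) -> float:
    """Base that a stated diverted amount implies at a given share."""
    return diverted_usd / share


full = diverted_spend(TOTAL_CAPEX_USD, CUSTOM_SHARE)
base = implied_base(ARTICLE_FIGURE_USD, CUSTOM_SHARE)

print(f"10% of the full projection: ${full / 1e9:.0f}B")
print(f"Base implied by $380B at 10%: ${base / 1e12:.1f}T")
print(f"Implied base as a share of the total: {base / TOTAL_CAPEX_USD:.0%}")
```

Either way the conclusion holds: even a small share shift moves hundreds of billions of dollars away from Nvidia.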

Graviton 5 going GA with Meta as a flagship customer is the most concrete signal yet that the Nvidia Tax is being eroded from within — by the very customers who pay it.

Sources: AWS About Amazon (June 10, 2026), Airbnb, Siemens, Atlassian, Snowflake, Epic Games benchmarks

About The Author

Gennaro Cuofano
