OpenAI Unveils Jalapeño — Its First AI Chip, Built With Broadcom in 9 Months, 50% Cheaper Than GPUs

OpenAI just unveiled its first custom AI chip, Jalapeño, built with Broadcom in nine months. Purpose-built for LLM inference, it shows 50% cost savings vs current GPUs in early testing. OpenAI is no longer just a model company. It’s building the full stack, from products to models to silicon.

Jalapeño — First Numbers

50%: Cost savings vs current GPUs (early testing)

9 months: Design to tape-out, which OpenAI calls the fastest ASIC cycle ever

2026: Initial deployment planned by end of year

LLM: Purpose-built for inference, not training

What OpenAI Built

OpenAI designed Jalapeño from the ground up — a custom inference accelerator architected specifically for the LLM workloads powering ChatGPT, Codex, the API, and future agentic products. Broadcom handled manufacturing. The chip went from initial design to tape-out in nine months — what OpenAI calls the fastest ASIC development cycle ever in high-performance semiconductors.

Early testing shows 50% cost savings compared to current GPUs and substantially better performance per watt. First samples are being tested now. Initial deployment is planned for the end of 2026, and Jalapeño is meant to be the first chip in a multi-generation compute platform.

The key insight: OpenAI burns $3.7 billion per quarter, and inference is the largest cost. A chip that cuts inference cost by 50% doesn’t just save money; it changes the economics of serving 400M+ ChatGPT users. It also underpins the $100B ad target: cheaper inference lowers the cost floor for free-tier users, which supports more free usage and therefore more ad impressions.

The Full Stack Play

OpenAI explicitly framed this as building the “full stack”: products (ChatGPT, Codex) → models (GPT series) → infrastructure (Jalapeño). The same vertical integration playbook has been on display across the industry this week:

SpaceX: Models (xAI) + Compute (Colossus) + Dev Tools (Cursor) + Robotics (Tesla) + Connectivity (Starlink)

Anthropic: Models (Claude) + Memory supply (Micron deal) + Government trust (Glasswing) + Series H capital

Google: Models (Gemini) + Chips (TPUs) + Cloud (GCP) + Distribution (Search/Android) + Content (A24)

OpenAI (now): Models (GPT) + Chips (Jalapeño) + Dev Tools (Codex) + Distribution (ChatGPT 400M users) + Revenue (Ads + Subs)

The Structural Read

NVIDIA’S PRICING POWER JUST GOT CHALLENGED

OpenAI is Nvidia’s largest customer. A custom chip that cuts inference costs 50% is a direct shot at Nvidia’s margin structure. OpenAI won’t stop buying Nvidia GPUs for training, but every dollar of inference that moves to Jalapeño is a dollar Nvidia doesn’t get. It also helps explain why Nvidia absorbed Groq’s IP for $20B: to keep exactly this kind of inference competition from emerging.

THE IPO NARRATIVE JUST GOT ANOTHER CHAPTER

This week OpenAI revealed a $100B ad target, GPT-5.5-Cyber + Daybreak, and now custom silicon. Each announcement builds the IPO story: not a model company, but a full-stack AI platform with its own chips, its own ad engine, and its own security program. That’s the pitch to public markets.

THE INFERENCE ECONOMICS CHANGE EVERYTHING

50% cheaper inference means more free-tier users (more ad impressions), a cheaper API (more developers), faster agents (more agentic products), and a more viable path to profitability. With inference as the largest cost, the $3.7B quarterly burn is largely an inference cost problem. Jalapeño is the structural answer: not more revenue, but less cost per query.
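To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. Only the $3.7B quarterly burn and the 50% cost reduction come from the figures above. The share of the burn that is inference and the fraction of inference that actually moves to Jalapeño are not disclosed, so `inference_share` and `migrated_fraction` are hypothetical inputs, and the function names are illustrative. It also treats burn as a stand-in for cost, which is a simplification.

```python
def quarterly_inference_savings(quarterly_burn, inference_share,
                                migrated_fraction, cost_reduction=0.5):
    """Estimate dollars saved per quarter by moving inference to a cheaper chip.

    inference_share and migrated_fraction are assumptions, not reported figures.
    """
    for name, value in (("inference_share", inference_share),
                        ("migrated_fraction", migrated_fraction),
                        ("cost_reduction", cost_reduction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    inference_spend = quarterly_burn * inference_share
    return inference_spend * migrated_fraction * cost_reduction


def blended_cost_per_query(gpu_cost_per_query, migrated_fraction,
                           cost_reduction=0.5):
    """Average cost per query when only part of the traffic runs on the new chip."""
    return gpu_cost_per_query * (1.0 - migrated_fraction * cost_reduction)


# From the article: $3.7B quarterly burn, 50% cost reduction.
# HYPOTHETICAL: half of the burn is inference, and 60% of inference migrates.
savings = quarterly_inference_savings(3.7e9, inference_share=0.5,
                                      migrated_fraction=0.6)
print(f"Estimated quarterly savings: ${savings / 1e9:.2f}B")
```

The point of the sketch is that the headline 50% only applies to the slice of spend that is both inference and actually migrated. If most inference stays on GPUs for the first few chip generations, the blended cost per query falls by much less than half.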

The Bottom Line

OpenAI just announced it built a chip in nine months that cuts inference costs in half. Nine months, from a company that didn’t have a hardware team two years ago. Jalapeño isn’t a moonshot; it’s a business necessity. When you’re burning $3.7 billion a quarter serving 400 million users, you either cut the cost of serving them or you run out of money. OpenAI chose to build the solution rather than rent it from Nvidia. That’s the most consequential strategic decision the company has made since launching ChatGPT.

Sources: OpenAI, CNBC, Bloomberg — June 24, 2026

About The Author

Gennaro Cuofano
