
OpenAI Taped Out Its First Chip, 'Jalapeño', in Just 9 Months — A Direct Jab at Nvidia

OpenAI unveiled Jalapeño, its first custom AI inference chip co-built with Broadcom. From design start to tape-out in just nine months — one of the fastest high-performance ASIC cycles ever. It's a blank-slate chip built only for LLM inference, deploying at gigawatt scale from late 2026. This is OpenAI's most concrete move yet to loosen its dependence on Nvidia.


The company that rented its GPUs finally taped out its own chip

Here's the deal: on June 24, OpenAI unveiled Jalapeño, its first custom AI inference chip, co-developed with Broadcom. The name earns its heat. Until now, everything OpenAI ran — ChatGPT, Codex, the API — sat on Nvidia GPUs. Now there's an actual piece of silicon that OpenAI designed itself, for its own models.

The most jaw-dropping number isn't performance — it's speed. From design kickoff to tape-out (handing the design to the fab) took just nine months. In high-performance silicon, getting a brand-new chip from blank page to near-production usually takes two to three years. Nine months for a full reticle-sized ASIC is one of the fastest cycles the industry has seen.
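To put that gap in numbers, here is a quick back-of-the-envelope sketch. It uses only the figures quoted above (a typical two-to-three-year cycle versus nine months); the variable and function names are made up for illustration, and the "typical" range is the article's rough rule of thumb, not a measured industry statistic.

```python
# Figures from the article: a typical 2-3 year cycle versus 9 months.
# Names below are illustrative only.
TYPICAL_CYCLE_MONTHS = (24, 36)
JALAPENO_CYCLE_MONTHS = 9


def speedup(typical_months: int, actual_months: int) -> float:
    """Return how many times shorter the actual cycle is than the typical one."""
    return typical_months / actual_months


low, high = (speedup(m, JALAPENO_CYCLE_MONTHS) for m in TYPICAL_CYCLE_MONTHS)
print(f"Roughly {low:.1f}x to {high:.1f}x shorter than a typical cycle")
```

In other words, if the usual range holds, nine months is somewhere around a third to a quarter of the normal schedule.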

Why is this big? Two currents collide here. First, the AI industry's biggest bottleneck has shifted from "models" to "compute" — whoever runs inference cheapest and fastest wins. Second, that compute has been gated almost entirely by Nvidia. OpenAI taping out its own chip signals its move from "model company" to "company that builds its own infrastructure," and it's the first hammer-strike against Nvidia's grip.

So today's story: what Jalapeño actually is, what OpenAI and Broadcom each get, how nine months was even possible, and what it means for a chip landscape that Nvidia, Google, and Amazon have carved up. Keep three characters in mind and the picture snaps into focus.

The cast — OpenAI, Broadcom, and a chip called Jalapeño

First, OpenAI. No introduction needed. What matters is that its core problem has changed. The goal used to be "a smarter model." Now the real battlefield is "how do we run that model cheaply and reliably for hundreds of millions of people?" Inference cost dictates the P&L, and GPU supply dictates how fast you can grow. Renting someone else's silicon was a structural weakness.

Next, Broadcom. Unfamiliar to consumers, but a quiet heavyweight. Its core isn't selling its own chips — it's custom silicon (ASICs): when a giant customer wants "a chip of their own," Broadcom designs and manufactures it for them. Parts of Google's TPU are reported to have passed through Broadcom's hands. Jalapeño marks Broadcom formally landing OpenAI as a customer.

Third, today's protagonist, Jalapeño. OpenAI calls it an "Intelligence Processor." The key point: this is not "a GPU repurposed for AI." It's a blank-slate inference chip built around what actually matters for modern LLM inference — kernels, memory movement, networking, serving patterns. It's specialized for inference, not training.

Tie the three together in one sentence: the company that has run models at the largest scale (OpenAI) partnered with the company that builds custom chips best (Broadcom) to tape out an inference-only chip (Jalapeño) — designed using its own models' real-world data — in just nine months. That's the spine of the story.

What was actually announced

Prose scatters the details, so here are the confirmed facts in a table.

| Item | Detail |
| --- | --- |
| Chip name | Jalapeño (OpenAI "Intelligence Processor") |
| Unveiled | June 24, 2026 |
| Partner | Broadcom (co-design + manufacturing) |
| Purpose | LLM inference only (not training) |
| Design | Blank-slate, built from scratch |
| Dev cycle | Design start → tape-out in just 9 months |
| Performance | Perf/watt "substantially better than current state-of-the-art" |
| Validation | Engineering samples running in-lab at target frequency/power, including GPT-5.3-Codex-Spark |
| Deployment | Initial rollout end of 2026, expanding after |
| Scale | Gigawatt-scale data centers with Microsoft and partners |

Line by line. First, the "inference-only" framing is the crux. AI chips split into training (teaching models) and inference (running finished models), and the money actually bleeds on inference — every time hundreds of millions of people query ChatGPT. Jalapeño targets exactly the most expensive seat: "I'll swap my own silicon into the costliest spot first."

Second, the "blank-slate" design and the "we used our own models to speed it up" line are both striking. OpenAI defined the requirements from its experience running ChatGPT, Codex, the API, and agents at scale, and even used GPT models to optimize the chip design. AI helped design the chip that runs AI better. That's one piece of the nine-month secret.

Third, be honest that performance is still qualitative. OpenAI only said perf/watt is "substantially better than current SOTA" — no hard benchmark numbers. But engineering samples running GPT-5.3-Codex-Spark in the lab at spec means this is real silicon, not a paper chip. There's a thing, not just a pitch.

What each side gets

OpenAI's win first. One, cost control: running inference on its own chip lets it cut per-query cost instead of renting GPUs at a premium. Two, supply stability: Nvidia GPUs are chronically short and queued, and an in-house line lets OpenAI clear the "can't grow because there's no compute" bottleneck itself. Three, design sovereignty: co-designing model and chip lets it optimize both as a single system, the same edge Apple gets from owning both chip and OS.

Broadcom's win is just as clear. It formally landed OpenAI as an anchor customer, and crucially this is the first generation of a "multi-generation compute platform." Once you start building chips together, the relationship locks in for gen two and three. After Google's TPU, Broadcom cements the position that "the custom silicon of the AI era gets built here." That's why the market eyed Broadcom as a hidden winner on announcement day.

The surprise beneficiary — and tension point — is Microsoft and partners. Broadcom said gigawatt-scale data center deployment with Microsoft and others begins in 2026, meaning Jalapeño isn't an experiment but goes into real, massive infrastructure. With OpenAI's chip running on Microsoft's cloud, both can trim their Nvidia bills at once — a double-edged sword that also binds the two companies' infrastructure even more tightly.

Past parallels — wins and failures

OpenAI isn't the first to chase its own chip. The brightest success is Google's TPU. Starting around 2015, Google built its own tensor processors and ran Search, Translate, and eventually Gemini on its own silicon — sharply cutting Nvidia dependence and controlling inference cost. It proved how strong a "model company that builds its own chip" can be. Jalapeño is clearly walking this path.

Another is Amazon's Trainium and Inferentia. AWS offered cloud customers "a cheaper option than Nvidia" with its own training and inference chips. They never fully replaced Nvidia, but the "vacuum up cost-sensitive workloads with in-house silicon" strategy built meaningful share. Jalapeño will likely go the same way, starting with OpenAI's own internal workloads.

The shadow of failure is real too. Designing a chip isn't the finish line — you need a software ecosystem. Catching Nvidia's CUDA, built over more than a decade, is the true hard part. Many startups built "AI chip hardware" but collapsed on the software stack. OpenAI's edge: it isn't selling the chip externally — it's a closed environment where it only needs to run its own models well, sidestepping much of that trap.

Competitor counter-plays

The most directly hit is, naturally, Nvidia. OpenAI was one of its biggest GPU buyers; that buyer moving some inference volume to its own chip cracks the "AI = Nvidia" equation. Short term, Nvidia won't wobble — demand for top-end training GPUs is still explosive, and Jalapeño is inference-only. Nvidia's counter: an integrated platform that does inference well too, plus CUDA ecosystem lock-in.

Google sits in an odd spot. On one hand its TPU walked this path first, so it's vindicated; on the other, OpenAI catching up dilutes "custom silicon" as a differentiator. Google's counter: generational TPU lead (years of production experience) and the maturity of its vertical integration with Gemini.

Amazon, Meta, and Microsoft all run or are preparing their own chips. The trend hardens: "AI inference, each on its own silicon" — and the fight over who fabricates those chips quietly hands Broadcom and TSMC the biggest upside. The real winners of the chip war may be the ones selling the weapons.

So what actually changes

If you're a regular ChatGPT user, you'll feel almost nothing right now. But medium-term it's a good sign: lower inference cost gives OpenAI room to be more generous with free/cheap tiers or to serve heavier models for less. "Cheaper on my own chip" can come back as "more for users."

If you're an AI startup or developer, watch "the future of API pricing." If OpenAI trims infra cost with its own silicon, that's ammunition for an API price war. But conversely, if OpenAI pushes models optimized for its own chip, certain models and features could get more tightly bound to OpenAI's infrastructure. Watch both lock-in and price.

If you follow semis or infrastructure, this is another confirmation of the bigger "AI chip diversification" theme. The center of gravity is shifting from an Nvidia-only era to an ecosystem where Big Tech companies each tape out their own chips and Broadcom and TSMC pick up the work. Just remember: with no benchmark numbers yet, "OpenAI replaced Nvidia" is way too early a conclusion.

One more layer — what "nine months" really means, and the variables left

To read this right, don't treat "nine months" as a mere speed brag. In silicon, time is money and risk. The usual two-to-three years exist because every design change forces re-verification, and a failure burns hundreds of millions. OpenAI compressing that to nine months isn't just "fast" — it's evidence that "a new chip-development methodology that accelerates the design-verification loop with AI" actually worked. This self-referential structure — designing your own chip with your own model — could become the standard process by which AI companies build chips.

Another easily missed context is the strategic weight of the "inference-only" choice. OpenAI didn't build a training chip. Training still needs Nvidia's top-end GPUs, and taking that on head-to-head would be reckless. Instead it seized the inference that bleeds cost daily, cutting cost at the surest spot and recycling those savings back into training and research — a virtuous loop. A textbook resource-allocation move: win the winnable fight first.

But the variables to weigh coldly are clear. One, yield: an engineering sample running in the lab and stable mass production sufficient to fill gigawatt data centers are entirely different levels of difficulty. Two, software: however good the chip, if the compiler and runtime that run models efficiently on it don't keep up, the performance won't materialize. OpenAI gave itself an advantage here by co-designing model and chip, but that is unproven at real operating scale.

Three, and the biggest picture, is leverage. Whether or not the in-house chip actually reaches mass production, OpenAI merely holding the card "we have an alternative" creates leverage in price and volume negotiations with Nvidia. Over the long term, this shift in the negotiating dynamic may move more money than any single chip's performance. Jalapeño is a chip and, at the same time, the most powerful message on the negotiating table.

🥄 Three Things You're Probably Wondering

— So is Nvidia finished? Not at all. Jalapeño is inference-only, and training — teaching models — is still Nvidia's kingdom. No performance numbers have been published either. This is "step one toward less dependence," not "replacement."

— Nine months sounds too fast. Does it actually work? Engineering samples are running GPT-5.3-Codex-Spark in the lab at target spec, so it's not a paper chip. But "works in the lab" and "stable mass deployment in gigawatt data centers" are different difficulties — the year-end rollout is the real test.

— Can I buy this chip? No. Jalapeño is something OpenAI built to run its own models on its own infrastructure, not a product for sale. You and I just use the ChatGPT running on top of it; we'll never touch the chip itself.


Numbers and criteria are as of announcement and may change. Investment calls are yours to make!
