
OpenAI Doubles Down on Cerebras: $20B Deal + Equity Warrant

OpenAI will pay Cerebras $20B+ over three years and can pick up to 10% equity as spend scales. A direct bet on reducing Nvidia dependency — and rewiring inference economics.

6 min read · The Information / Manila Times

$20B. And up to 10% equity.

OpenAI will pay Cerebras at least $20 billion over three years, with total spending potentially reaching $30 billion — and that spend will translate into warrants for up to a 10% equity stake, according to reporting by The Information. A separate $1 billion commitment from OpenAI will help fund Cerebras data center construction. This more than doubles the $10 billion, 750-megawatt deal disclosed in January.

What does it mean?

OpenAI is serious about reducing Nvidia dependency — specifically for inference.

Why this matters — and how we got here

Inference is the real bill

When ChatGPT took off in late 2022, every AI shop obsessed over training: bigger models, faster, which meant Nvidia H100s by the tens of thousands. By 2025, the story had shifted. Models are already huge. The user base exploded. And suddenly the real cost wasn't training. It was inference.

Inference is what happens every time you ask ChatGPT a question: the server runs the model and generates an answer. Training is a one-time capex spike; inference is a linear cost that grows with every user, every query, every day. For a service with hundreds of millions of weekly users, inference compute becomes the largest line in the opex budget.
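The asymmetry is easy to make concrete with a back-of-envelope model. Every constant below is an illustrative assumption, not a reported figure:

```python
# Toy cost model: training as a one-time spike, inference as a cost
# that scales with usage. Every constant is an illustrative assumption,
# not a reported figure.

TRAINING_RUN = 500e6        # assumed one-time cost of a frontier training run, $
COST_PER_M_TOKENS = 0.50    # assumed blended serving cost, $ per 1M tokens
TOKENS_PER_QUERY = 1_000    # assumed tokens generated per query
QUERIES_PER_USER_DAY = 10   # assumed usage intensity

def daily_inference_cost(users: float) -> float:
    """Inference opex grows linearly with users and queries."""
    tokens = users * QUERIES_PER_USER_DAY * TOKENS_PER_QUERY
    return tokens / 1e6 * COST_PER_M_TOKENS

for users in (10e6, 100e6, 800e6):
    daily = daily_inference_cost(users)
    print(f"{users / 1e6:>4.0f}M users: ${daily:>10,.0f}/day "
          f"(one training run every {TRAINING_RUN / daily:,.0f} days)")
```

At the assumed rates, inference at 800M users matches the cost of an entire training run roughly every four months, and it never stops accruing.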

Nvidia H100s and B200s dominate training. But they're overspec for inference. A purpose-built inference chip, one that trades general-purpose flexibility for raw token throughput, can be much more efficient. Nvidia's silicon was not designed with that trade-off in mind.

What Cerebras actually sells

Cerebras is a Silicon Valley AI chip company founded in 2016. Its signature product is the Wafer-Scale Engine: essentially an entire 12-inch silicon wafer used as one chip. Nvidia's H100 die is ~814mm². The Cerebras WSE-3 is 46,225mm². Roughly 57× larger.

Why build a chip that big? To eliminate the bottleneck of chip-to-chip communication. In LLM inference, a huge amount of time is wasted moving data between GPUs. If all compute happens inside one wafer, that traffic disappears.
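A toy latency model shows the effect. The numbers below are invented, order-of-magnitude placeholders, not vendor benchmarks (the wafer speedup factor is tuned to land near the 1,800 tok/s figure in the table below):

```python
# Toy per-token decode latency. compute_us, hop_us, and the wafer
# speedup are invented illustrative values, not measured numbers.

def per_token_latency_multi_gpu(n_gpus: int,
                                compute_us: float = 25_000.0,
                                hop_us: float = 300.0) -> float:
    """Compute parallelizes across GPUs, but each decode step pays
    an inter-GPU communication cost that grows with the GPU count."""
    return compute_us / n_gpus + hop_us * (n_gpus - 1)

def per_token_latency_wafer(compute_us: float = 25_000.0,
                            on_wafer_speedup: float = 45.0) -> float:
    """Single wafer: no off-chip hops, and on-chip SRAM shrinks the
    compute term itself (45x is an assumed, illustrative factor)."""
    return compute_us / on_wafer_speedup

for n in (1, 4, 8):
    lat = per_token_latency_multi_gpu(n)
    print(f"{n} GPU(s): {lat:7.0f} us/token ({1e6 / lat:5.0f} tok/s)")
lat = per_token_latency_wafer()
print(f"wafer   : {lat:7.0f} us/token ({1e6 / lat:5.0f} tok/s)")
```

The pattern: adding GPUs shrinks the compute term but grows the communication term, so per-token latency plateaus. A single wafer removes the second term entirely.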

| Metric | Nvidia H100 | Cerebras WSE-3 |
|---|---|---|
| Die area | 814mm² | 46,225mm² |
| Transistor count | 80B | 4,000B (4T) |
| AI cores | 18,432 CUDA (full die) | 900,000+ |
| On-chip memory | ~50MB (L2 cache) | 44GB SRAM |
| Memory bandwidth | 3.35TB/s (off-chip HBM3) | 21PB/s (on-chip SRAM) |
| Tokens/sec (Llama3-70B inference) | ~30–50 | 1,800+ |

That last row is the point. On Llama3-70B inference, Cerebras WSE-3 is tens of times faster per device than an Nvidia H100. The chip costs tens of times more, too — but on tokens-per-dollar and watts-per-token for inference workloads, Cerebras wins more often than headlines suggest.
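If you want to sanity-check the tokens-per-dollar and watts-per-token claims, the arithmetic is simple. System prices, power draw, and single-stream throughput below are rough assumptions for illustration only:

```python
# Back-of-envelope tokens/$ and joules/token. Prices, power, and
# single-stream throughput are rough assumptions; batched GPU serving
# raises aggregate throughput and shifts the tokens/$ result.

SECONDS_3Y = 3 * 365 * 24 * 3600  # assume a 3-year service life

def tokens_per_dollar(tok_per_sec: float, system_cost: float,
                      watts: float, usd_per_kwh: float = 0.08) -> float:
    """Lifetime tokens divided by (hardware + 3y energy) cost."""
    energy = watts / 1000 * (SECONDS_3Y / 3600) * usd_per_kwh
    return tok_per_sec * SECONDS_3Y / (system_cost + energy)

def joules_per_token(watts: float, tok_per_sec: float) -> float:
    """Instantaneous power divided by throughput."""
    return watts / tok_per_sec

# Assumed: H100 server share ~$40k, ~1 kW, 40 tok/s single-stream.
# Assumed: Cerebras CS-3 ~$2.5M, ~23 kW, 1,800 tok/s single-stream.
for name, tps, cost, watts in [("H100", 40, 40_000, 1_000),
                               ("CS-3", 1_800, 2_500_000, 23_000)]:
    print(f"{name}: {tokens_per_dollar(tps, cost, watts):>8,.0f} tok/$, "
          f"{joules_per_token(watts, tps):5.1f} J/token")
```

Under these particular single-stream assumptions the H100 edges ahead on tokens-per-dollar while the wafer wins decisively on energy per token; batching, utilization, and real pricing swing both by multiples, which is why per-workload benchmarking (discussed further down) matters.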

The structure of the deal

Three pieces

Break the deal into three parts.

① The $20B+ purchase commitment. OpenAI commits to at least $20B of Cerebras-powered server capacity over three years, scaling toward $30B. This isn't hardware procurement in the traditional sense — it's mostly renting Cerebras-operated data center capacity.

② Equity warrants. As spending grows, OpenAI earns warrants — rights to buy Cerebras equity — up to 10%. This construct is interesting because OpenAI gets to monetize its anchor-customer status, and Cerebras locks in its biggest customer as an equity holder.

③ $1B in data center construction support. A separate commitment of up to $1B from OpenAI to fund Cerebras's buildout — power procurement, land, cooling, the physical stuff.
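The vesting schedule for the warrant piece (②) hasn't been disclosed. As a purely hypothetical illustration of the construct, assume the stake accrues in proportion to cumulative spend and caps at 10% at the $30B ceiling:

```python
# Hypothetical warrant accrual: the actual schedule is not public.
# Assumes the stake vests linearly with cumulative spend and caps
# at 10% once spend reaches the $30B ceiling.

CEILING = 30e9    # reported upper end of the spend commitment
MAX_STAKE = 0.10  # reported maximum equity stake

def vested_stake(cumulative_spend: float) -> float:
    return MAX_STAKE * min(cumulative_spend / CEILING, 1.0)

for spend in (10e9, 20e9, 30e9):
    print(f"${spend / 1e9:.0f}B spent -> {vested_stake(spend):.1%} stake")
```

Whatever the real curve looks like, the direction is the same: every dollar OpenAI spends pulls it deeper into Cerebras's cap table.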

Why this bundle, now

Consider Nvidia's relationship with OpenAI. Nvidia reportedly committed more than $100B in investment to OpenAI over the last year, and that money ultimately flows back into Nvidia chip purchases. A loop.

For OpenAI, staying inside that loop means no negotiating power. So it's diversifying: Cerebras, Google TPUs, custom ASICs co-developed with Broadcom, AWS Trainium. This Cerebras deal is the largest single diversification move.

PANews framed the moment as a "war of inference": Nvidia and OpenAI each striking $20B-scale deals to cement inference-infrastructure position.

OpenAI isn't leaving Nvidia. It's building a portfolio. Training stays on Nvidia. Inference splits across Cerebras, Broadcom-based ASICs, and Nvidia. Supplier concentration reduced. Standard stuff.

The Cerebras IPO subtext

This deal lines up with Cerebras's IPO plans. Cerebras first filed to go public in 2024 but stalled over the equity stake held by Abu Dhabi-based G42 and the resulting U.S. national security (CFIUS) review, and it eventually scrapped that attempt. In April 2026 it refiled, targeting a Q2 listing at about a $35B valuation and a $3B raise.

The OpenAI deal is the centerpiece. An anchor customer with multi-year, multi-billion revenue commitments is exactly the story IPO investors want. Cerebras's last private mark was $23.1B; $35B would be a ~50% markup.

The wider picture: inference chip market goes multi-player

OpenAI's move toward Cerebras signals that the inference chip market is no longer Nvidia-only. Here's the competitive map as of April 2026:

| Company / Chip | Position | Status |
|---|---|---|
| Nvidia B200/GB200 | Training + inference, all-purpose | Market dominant |
| Cerebras WSE-3 | Wafer-scale inference | OpenAI $20B+, IPO Q2 |
| Google TPU v5p | Internal + Anthropic | Broadcom-designed |
| AWS Trainium 3 | Internal + Anthropic JV | 2026 volume |
| Huawei Ascend 950PR | China inference market | Alibaba, Baidu adoption |
| Groq LPU | Ultra-low-latency inference | Meta, DeepSeek partnerships |
| SambaNova SN40L | Enterprise inference | On-prem strength |
| FuriosaAI RNGD | Korean NPU | Commercial ops started |

Everyone's positioning as "inference specialist." Beating Nvidia in training is still very hard. Beating Nvidia on specific inference workloads is a different, more winnable game.

What this means for you

If you're a developer

No immediate change. OpenAI's API endpoints look the same. Over time, though, expect tighter latency and per-token pricing as Cerebras infrastructure comes online. Faster-response tiers (GPT-5 Turbo) are the likeliest short-term winners.
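Concretely, the serving silicon is invisible at the API layer. A standard chat-completions call works the same regardless of the hardware behind it (the model name below is a placeholder; use whatever tier you actually run):

```python
# The serving silicon is invisible at the API layer: this standard
# chat-completions call is unchanged whether the backend runs on
# Nvidia or Cerebras hardware. The model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # substitute whatever fast-response tier you use
    messages=[{"role": "user",
               "content": "Explain wafer-scale inference in one paragraph."}],
)
print(response.choices[0].message.content)
```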

If you're running infrastructure

The evaluation bar for "Nvidia alternatives" just moved. When OpenAI validates Cerebras at this scale, it's a signal for enterprise architects to benchmark alternatives on inference-heavy workloads — chatbots, search, code assistants — where Cerebras or Groq may beat H100 on tokens-per-dollar.
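A minimal throughput probe is enough to start. Cerebras and Groq both advertise OpenAI-compatible endpoints, so the same client works across vendors; the base URLs, keys, and model names below are placeholders to replace with each vendor's published values:

```python
# Minimal tokens/sec probe for any OpenAI-compatible endpoint.
# Base URL, API key, and model name are placeholders; substitute the
# vendor's published values before running.
import time

from openai import OpenAI

def measure_tokens_per_sec(base_url: str, api_key: str, model: str) -> float:
    """Time one non-streaming completion; divide output tokens by wall clock."""
    client = OpenAI(base_url=base_url, api_key=api_key)
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": "Explain KV caching in about 300 words."}],
    )
    elapsed = time.perf_counter() - start
    return resp.usage.completion_tokens / elapsed

# Placeholder usage; check each vendor's docs for live values:
# print(measure_tokens_per_sec("https://api.cerebras.ai/v1", "...", "llama3.1-70b"))
# print(measure_tokens_per_sec("https://api.groq.com/openai/v1", "...", "llama-3.1-70b-versatile"))
```

Run the same prompt set against each candidate endpoint, then divide measured throughput by your negotiated per-token price to get the tokens-per-dollar figure that actually matters for your workload.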

If you're an investor

Nvidia's dominance is starting to show cracks — narrow ones, but real. Nvidia still has 80%+ of AI chip share, but within the inference segment, competitors have room to reach 10–20% share. Broadcom, Cerebras, and the Google/Amazon custom-ASIC efforts are the structural beneficiaries.
