
Google + Marvell: Two New AI Chips to Rewire Inference

Alphabet is in talks with Marvell to co-develop two chips — a Memory Processing Unit (MPU) to pair with TPUs and a dedicated inference TPU. A quiet but decisive move to expand the TPU stack against Nvidia.

6 min read · News.az
[Image: AI accelerator chip design concept representing Google TPU and MPU. Source: Unsplash]

The TPU is splitting in two

On April 19, 2026, reports surfaced that Alphabet is in discussions with Marvell Technology to jointly develop two custom AI chips. One is an MPU — a Memory Processing Unit designed to pair with TPUs. The other is a new TPU built specifically for inference. Marvell's stock jumped 6–7% pre-market; Google's slipped about 1%.

One-line version: Google is quietly going after Nvidia.

Why this matters — and how we got here

Remember what the TPU is

Google started building TPUs internally in 2015 to accelerate ML workloads in its search and ads systems. Today, TPUs are the backbone of Gemini training and inference. They're also sold externally — Anthropic, Apple, Salesforce, and others run production AI workloads on TPU through Google Cloud.

TPUs have historically been co-designed with Broadcom. Google defines architecture; Broadcom produces silicon. That partnership is the spine of Google's AI infrastructure.

Now Marvell joins as a second ASIC partner.

Why two new chips — MPU and inference TPU

This is where it gets interesting.

The first chip, the MPU, is designed to handle the largest bottleneck in LLM inference: memory bandwidth. During decoding, an LLM streams its full weight set from memory for every generated token, so more time is typically lost to memory I/O than to raw compute. The MPU offloads that memory work from the TPU so the TPU can focus on the math. Division of labor at the silicon level.
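The memory-bandwidth argument can be made concrete with a back-of-envelope roofline estimate. All numbers below are illustrative assumptions for the sketch, not published TPU or MPU specifications:

```python
# If every generated token must stream the full weight set from memory,
# memory bandwidth alone caps decode throughput, regardless of FLOPs.

def decode_tokens_per_sec(model_bytes: float, mem_bw_bytes_per_sec: float) -> float:
    """Upper bound on decode throughput when generation is purely
    memory-bound (batch size 1, no weight caching or batching tricks)."""
    return mem_bw_bytes_per_sec / model_bytes

# Hypothetical 70B-parameter model stored in 8-bit weights (~70 GB)
model_bytes = 70e9
# Hypothetical HBM bandwidth of ~1.6 TB/s
bandwidth = 1.6e12

ceiling = decode_tokens_per_sec(model_bytes, bandwidth)
print(f"{ceiling:.1f} tokens/s ceiling")  # ~22.9 tokens/s
```

Under these assumptions the compute units sit idle most of the time, which is exactly the gap a dedicated memory-processing chip would target.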

The second chip, a purpose-built inference TPU, carves inference out of a product line that previously handled training and inference together. Existing TPUs are general-purpose. The plan is to keep training on the current TPU line and ship a separate, cheaper, faster inference-only chip beneath it.

The resulting stack:

Stage      | Today                         | New structure
Training   | TPU v5p (general)             | TPU v5p or successor training TPU
Inference  | TPU v5e (lightweight general) | MPU + inference TPU combo
Memory I/O | Inside TPU                    | Dedicated MPU
Compute    | Inside TPU                    | Dedicated inference TPU

If this stack ships and works, cost-per-token and watts-per-token for inference drop meaningfully. Analysts are framing this as "the architecture that could reshape ASIC inference."
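To see how offloading memory I/O would flow through to unit economics, here is a hypothetical arithmetic sketch. The hourly cost and throughput figures are invented for illustration, not Google or analyst numbers:

```python
# Cost per million tokens as a function of chip cost and throughput.
# All inputs are assumptions chosen to illustrate the mechanism.

def cost_per_million_tokens(chip_cost_per_hour: float, tokens_per_sec: float) -> float:
    tokens_per_hour = tokens_per_sec * 3600
    return chip_cost_per_hour / tokens_per_hour * 1e6

# Baseline: a general-purpose accelerator serving inference
baseline = cost_per_million_tokens(chip_cost_per_hour=2.0, tokens_per_sec=500)

# Assumed MPU + inference-TPU combo: ~2x throughput at similar hourly cost
combo = cost_per_million_tokens(chip_cost_per_hour=2.2, tokens_per_sec=1000)

print(f"baseline: ${baseline:.2f}/M tokens, combo: ${combo:.2f}/M tokens")
```

Under these made-up inputs the per-token cost drops by roughly 45%; the real figure depends entirely on what the combo actually delivers in production.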

Why Marvell

Marvell is a U.S. semiconductor company founded in 1995. Core businesses: data center networking, storage controllers, and — critically — ASIC design services. The ASIC unit has ridden the AI wave hard in recent years; Marvell contributed heavily to AWS Trainium design. Now Google TPU joins the portfolio.

Google's rationale for adding Marvell breaks into three parts.

Supply chain diversification. A single-vendor relationship with Broadcom limits leverage on pricing and schedule. Bringing Marvell in opens room to push back.

Specialization. Marvell has particular strength in memory-controller design — which aligns well with the MPU concept.

Capacity. Marvell has reserved TSMC advanced-node (2nm, 3nm) slots that can absorb Google's volume growth.

One thing is clear: Google is ramping TPU volume to a level where it can seriously chase Nvidia's share. Marvell is one of the engines making that possible.

The wider picture: TPU as an external product

This deal isn't just about Google making better internal silicon. It's about Google's accelerating push to sell TPU capacity externally.

What already happened:

  • Anthropic ran Claude training and inference on TPU through 2024–2025, and has a separate ~$30B TPU commitment with Broadcom on top.
  • Apple runs parts of Apple Intelligence on TPU.
  • Salesforce and Character.AI are on TPU-based inference.

What's next:

  • Broadcom forecasts custom ASIC revenue growing about 45% in 2026, with a significant chunk from TPU.
  • Adding Marvell expands supply capacity and enables larger external sales.

If this plays out, the AI infrastructure market shifts from "Nvidia-dominant with cloud ASICs on the side" to a genuine multi-platform market: Nvidia + Google TPU + Cerebras + others.

TheNextWeb summarized this as "how Google is quietly planning to take on Nvidia."

The updated AI chip landscape

Here's the competitive map today:

Company   | Primary chip                            | Design partner           | Target
Nvidia    | B200 / GB200                            | Internal                 | Training + inference, all
Google    | TPU v5p / v5e / new MPU + inference TPU | Broadcom + Marvell (new) | Internal Gemini + external
AWS       | Trainium 3 / Inferentia                 | Marvell, Alchip          | Internal Bedrock + Anthropic
Microsoft | Azure Maia 100                          | GUC                      | Azure internal
Meta      | MTIA                                    | Internal + partners      | Recommenders, ranking
Cerebras  | WSE-3                                   | Internal                 | Inference-only (OpenAI)
Huawei    | Ascend 950PR                            | Internal                 | China domestic

Once Marvell is inside Google TPU, it becomes the shared ASIC backbone across AWS and Google — the common spine of the non-Nvidia camp.

What this means for you

If you use Google Cloud

If you're calling Gemini API or Vertex AI from production, expect meaningful inference price cuts over the next 6–12 months. Analysts estimate a 20–40% per-token cost reduction is plausible once the MPU + inference TPU combo ships in production.

If you're Nvidia-first in your architecture

Time to plan for multi-backend. Gemini, Claude, and GPT run on different hardware stacks, and those differences are starting to show up in pricing, latency, and availability. Avoiding vendor lock-in means keeping at least two model providers in your portfolio as a matter of course.
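A minimal sketch of what "multi-backend" can look like in code: route requests through one interface so a second provider can be swapped in without touching call sites. The router class, provider names, and `complete` signature here are illustrative, not any real SDK's API:

```python
# Provider-agnostic model router with a simple fallback policy.
# Real adapters would wrap vendor SDKs behind the Backend callable.
from dataclasses import dataclass
from typing import Callable, Dict

Backend = Callable[[str], str]  # prompt in, completion text out

@dataclass
class Completion:
    text: str
    provider: str

class ModelRouter:
    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def complete(self, prompt: str, preferred: str, fallback: str) -> Completion:
        """Try the preferred backend; fall back to the second if it raises."""
        try:
            return Completion(self._backends[preferred](prompt), preferred)
        except Exception:
            return Completion(self._backends[fallback](prompt), fallback)

router = ModelRouter()
router.register("gemini", lambda p: f"[gemini] {p}")
router.register("claude", lambda p: f"[claude] {p}")
print(router.complete("hello", preferred="gemini", fallback="claude").text)
```

The point is not the fallback logic itself but the seam: once call sites only see `ModelRouter`, pricing or availability shifts between providers become a configuration change rather than a rewrite.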

If you're an investor

Marvell's pre-market spike is more than a one-day reaction. If Marvell's ASIC revenue compounds across Google and AWS simultaneously, its AI-ASIC revenue share could rival or surpass Broadcom's by 2027–2028. The caveat: ASIC projects carry yield and timeline risk, so watch the early production ramp carefully.

For Alphabet, TPU external revenue is becoming a meaningful cloud growth driver. Broadcom already books multi-billion TPU-related revenue. Adding Marvell expands the total addressable pie and improves Google Cloud margin structure.
