spoonai
TOPPentagonAnthropicClaude

The Pentagon Is Testing OpenAI, Google and xAI to Replace Claude — and Anthropic Is Fighting Back in Court

Bloomberg reported May 21 that the DoD is competitively testing OpenAI, Google and xAI models to replace Anthropic's Claude. 25 'power users' run rival models on GenAI.mil through workflows that relied on Claude. It follows Secretary Hegseth designating Anthropic a 'supply-chain risk'; Anthropic is fighting in court, warning the move could cost billions.

·8분 소요·BloombergBloomberg
공유
Pentagon tests rival AI models to replace Claude
Source: DefenseScoop

Here's the deal: the company that once refused a DoD contract on ethics is now the one being replaced

Bloomberg reported on May 21 that the U.S. Department of Defense is competitively testing models from OpenAI, Google and xAI to replace Anthropic's Claude. Not a paper evaluation — rival models are being slotted directly into military workflows Claude actually ran, to decide which performs best. Just months ago Anthropic made headlines as "the company that turned down a DoD contract on ethical grounds." The standing has flipped 180 degrees.

Here's the structure. 25 designated "power users" run OpenAI, Google and xAI (Grok) models on the GenAI.mil platform (separate from the Maven Smart System) through tasks Claude used to handle, and rate their preferences. In other words, the DoD has already built "workflows that run without Claude" and is gathering data on which alternative fits best.

The trigger was Defense Secretary Pete Hegseth designating Anthropic's products a "supply-chain risk." Per NPR and CNBC, the designation took effect "immediately" in early March, and testing began three days later. The proximate cause: Anthropic refused to drop usage restrictions barring mass surveillance and lethal autonomous weapons. The company's "safety-first" principle collided head-on with the DoD's mission requirements.

It's now a standoff. Emil Michael, undersecretary of defense for research and engineering, said negotiations are "on ice" because of Anthropic's legal challenge, and the Pentagon is ready to move to other vendors. Anthropic is fighting in court, calling the designation unjustified and warning it could cost billions in revenue. (The testing itself began around March 1, but this May 21 reporting is the first to detail the 25-user eval, the stalled talks and the litigation.)

The players — Hegseth, Anthropic, and the three rivals

Pete Hegseth (Defense Secretary). He pulled the trigger. With a hardline "a vendor that constrains the mission is a risk" stance, he labeled Anthropic a supply-chain risk. From the DoD's view, a supplier that won't let it use a purchased tool its own way is hard to accept.

Anthropic. A company built on "responsible AI." It held firm on usage restrictions against mass surveillance and lethal autonomous weapons. Tellingly, when it declined a DoD contract on ethical grounds in early 2026, Claude briefly hit #1 on the U.S. App Store. That "principle" has now come back as a commercial cost.

OpenAI / Google / xAI. The three rivals already hold classified contracts with the DoD. Google signed a deal letting the government use its AI on classified networks for "any lawful government purpose," with a clause to help adjust safety filters at the government's request. OpenAI and xAI's Grok also agreed to classified deployments. They're aimed squarely at the gap Anthropic left.

Emil Michael (Undersecretary, R&E). He's the one who publicly described the negotiation state. Saying "on ice due to the suit, ready to move vendors" is itself a pressure card against Anthropic.

What's at stake, and where it's stuck

The core issue. It's about usage restrictions. Anthropic kept clauses barring its models from mass surveillance and lethal autonomous weapons; the DoD wanted those constraints lifted. AI safety principles clashed with defense mission needs — a fundamental question of "whose tool is it, and who sets the usage rules?"

The test structure. On GenAI.mil, 25 power users run rival models through Claude's workflows and rate preferences. This is usage-based evaluation, not a benchmark. It signals the DoD already has alternative-operating capability — and it's a clear show of "BATNA" in the Anthropic negotiation.

Why it's stuck. Anthropic challenging the supply-chain-risk designation in court froze talks. The DoD's stance: "no negotiating during litigation; meanwhile we'll harden alternatives." Anthropic's stance: "the designation is unjust and a multi-billion-dollar revenue hit." Neither side can easily back down.

A string of ironies. Per CNBC, even amid the risk designation, Claude was still used in some operations (e.g., Iran-related). There was a contradiction — "flagged as a risk, yet still used because no full replacement exists." This testing reads as the work to resolve that by securing a replacement.

Item DoD Anthropic
Usage restrictions wants them lifted (mission freedom) holds surveillance/autonomous-weapon limits
Current move risk designation, alternative testing legal challenge
Negotiation on ice, ready to switch vendors warns of billions in lost revenue
Rationale mission first responsible-AI principle

What each side gets out of it

OpenAI / Google / xAI. The most direct winners-in-waiting. They get to split the defense-AI market Anthropic vacated. Flexibility on "adjust safety settings on government request" became a differentiator. National-security references are also a springboard into other government and regulated-industry contracts.

The DoD. A multi-vendor strategy avoids single-supplier dependence. It sends a message to the whole market: it won't be held hostage by "a vendor that constrains the mission on AI-safety grounds." Its negotiating leverage grew.

Anthropic's (paradoxical) upside. It loses near-term revenue but partly strengthens the "we keep our principles" brand. Drawing a line on surveillance and autonomous weapons is a trust asset with enterprises, civil society and some governments. The problem is the price tag is billions.

Who loses. Anthropic's revenue and defense footing take a direct hit. More broadly, by exposing just how large the commercial cost of holding AI-safety principles can be, it pushes other labs to weigh "principle vs. revenue" more coldly.

Precedents — successes and failures

Google's Project Maven exit (2018). Google employees revolted over a DoD drone-AI project and the company pulled out; others filled the void. The pattern rhymes with Anthropic — "ethical refusal → competitor takes the seat." The difference: this time the company didn't leave voluntarily, it was pushed out by a "risk" designation.

MS JEDI vs. Amazon litigation (2019–2021). A fight over a DoD cloud contract turned into lawsuits and the contract was ultimately redesigned. The lesson: "vendor-DoD disputes that go to court get long and messy." Anthropic's suit could be a similar drawn-out battle.

The dual-use dilemma. Historically, powerful general-purpose tech (crypto, satellites, drones) always carried the tension of "civilian principles vs. military use." AI is no exception. This will be remembered as the first major case of that tension replaying on the new stage of foundation models.

Competitor counter-plays

OpenAI. It'll lean into a "safety and mission can coexist" middle message — more flexible to government demands than Anthropic, while positioning "we have guardrails too." The big picture: bank defense references and expand across government and enterprise.

Google / xAI. Google is most aggressive with its "any lawful government purpose" deal; xAI leverages Musk's political access to move fast. Both emphasize being the "unconstrained partner" to differentiate from Anthropic.

Anthropic. Alongside the court fight, it'll detour into growing the "responsible AI" brand in commercial, enterprise and allied-government markets. Jack Clark's recent Oxford lecture on institutional safety fits this. It's a hedge — lose defense, win the "trust-as-asset" markets.

Other labs. Watching this, they'll re-examine "how far to hold our principles." Too flexible and the safety brand wobbles; too rigid and you lose a huge government market. Striking that balance becomes a future differentiator.

So what actually changes — by persona

AI companies / founders. It's now clear that "how you design usage-restriction policy" is commercially decisive for entering government and regulated markets. Safety principles are a brand asset but can conflict with massive revenue. Plan the balance in advance.

Enterprise procurement. Design AI adoption assuming multi-vendor and replaceability. Lock a workflow too deeply into one model and a vendor-provider dispute can shake the whole thing. Build abstraction layers and fallback paths ahead of time.

Policy / security folks. A case study in how government acts when "AI safety" and "national-security mission" collide. The intent to avoid dependence on any one company via multi-vendor hedging is clear. Allied governments will hit similar dilemmas soon.

Civil society / human rights. It exposed how real the pressure is on restrictions against mass surveillance and lethal autonomous weapons. A matter to watch critically — how robust an AI company's voluntary guardrails can be in front of a government.

Everyday citizens. Little direct impact, but it surfaces a values question: where does the company behind your AI draw the line on military and surveillance use? A useful signal when choosing which AI to trust.

Further reading

관련 기사

무료 뉴스레터

AI 트렌드를 앞서가세요

매일 아침, 엄선된 AI 뉴스를 받아보세요. 스팸 없음. 언제든 구독 취소.

매일 30개+ 소스 분석 · 한국어/영어 이중 언어광고 없음 · 1-클릭 해지