# Two Strikes in Five Days: What Anthropic's Security Lapses Reveal
From the Mythos model leak to the exposure of 500,000 lines of Claude Code source. A deep dive into Anthropic's back-to-back security incidents and what they mean for the AI safety narrative.

## The Company That Preaches AI Safety Just Failed Its Own Security Audit. Twice.
3,000 unpublished blog drafts. 500,000 lines of source code. And an internal assessment calling its own model "an unprecedented cybersecurity risk."
That's what Anthropic accidentally released to the world in March 2026. Not through a targeted hack. Through misconfigured infrastructure.
Here's the deal: the company that positioned itself as the responsible alternative to OpenAI and Meta just demonstrated, in the most public way possible, that operational security is a fundamentally different challenge from AI safety research. And that gap has consequences for the entire industry.
## Strike One: Mythos Exposed Through a CMS Misconfiguration
On March 26, Fortune broke the story. Nearly 3,000 unpublished blog drafts from Anthropic's content management system were sitting in a publicly searchable data cache. Anyone with the right query could find them.
Among those drafts was a post about "Mythos" -- internally codenamed Capybara -- a new model tier above Claude Opus 4.6. Anthropic's own language described it as "a step change" in AI performance. But the draft revealed something far more sensitive: Anthropic had been privately briefing senior U.S. government officials that Mythos could make large-scale cyberattacks significantly more feasible.
| Detail | What Leaked |
|---|---|
| Model Name | Mythos (internal codename: Capybara) |
| Positioning | New tier above Opus, highest-cost model |
| Performance | "Dramatic" improvements in coding, reasoning, cybersecurity |
| Risk Assessment | First model to materially increase large-scale cyberattack feasibility |
| Leak Vector | CMS misconfiguration exposing 3,000 unpublished drafts |
CNN's follow-up reporting confirmed that Anthropic described the model as a potential "watershed moment" for cybersecurity -- in both offensive and defensive capabilities. The company acknowledged "human error" in CMS configuration.
## Strike Two: Claude Code's Entire Codebase Ships in an npm Update
Five days later, on March 31, security researcher Chaofan Shou discovered something unexpected in a routine Claude Code npm package update. A debug source map file (.map) had been included in the production bundle. That single file allowed the entire TypeScript source to be reconstructed.
2,000 files. 500,000 lines. The complete Claude Code codebase, laid bare.
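Why is one stray file so damaging? Source maps produced by most bundlers embed the original files verbatim in a `sourcesContent` array, so "reconstruction" is little more than reading JSON. A minimal sketch of the idea -- the file name `cli.js.map` and the output layout here are illustrative, not details from the leak:

```typescript
// extract-sources.ts -- minimal sketch of why a shipped .map file is fatal.
// Bundler-generated source maps usually embed the original files verbatim
// in a "sourcesContent" array; recovering them takes a few lines of code.
import { mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname, join } from "node:path";

interface SourceMapV3 {
  sources: string[];                  // original file paths
  sourcesContent?: (string | null)[]; // original file contents, if embedded
}

const map: SourceMapV3 = JSON.parse(readFileSync("cli.js.map", "utf8"));

map.sources.forEach((source, i) => {
  const content = map.sourcesContent?.[i];
  if (content == null) return; // this entry has no embedded content
  // Strip leading "../" segments so every file lands inside ./recovered.
  const outPath = join("recovered", source.replace(/^(\.\.\/)+/, ""));
  mkdirSync(dirname(outPath), { recursive: true });
  writeFileSync(outPath, content);
});
```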
Within hours, mirror repositories appeared on GitHub, racking up 25,000+ stars in a single day. Developers who analyzed the code found three unreleased features:
| Unreleased Feature | Description |
|---|---|
| Persistent Assistant Mode | Claude running continuously in the background, monitoring file changes |
| Remote Control | Controlling desktop Claude sessions from a smartphone |
| Session Auto-Review | A metacognitive system where AI evaluates its own work output |
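Anthropic's implementations aren't public, but the general shape of a persistent assistant mode is easy to picture: a long-running process that watches a source tree, debounces bursts of saves, and hands each batch of changes to a model for review. A hypothetical sketch -- `requestReview` stands in for whatever model call a real agent would make:

```typescript
// watcher.ts -- illustrative only, not Anthropic's code: the general shape
// of a persistent assistant loop that watches a project directory and
// queues changed files for model review.
import { watch } from "node:fs";

const pending = new Set<string>();
let timer: ReturnType<typeof setTimeout> | undefined;

async function requestReview(files: string[]): Promise<void> {
  // Placeholder: a real agent would send diffs to a model endpoint
  // and surface the feedback in the editor or terminal.
  console.log(`reviewing ${files.length} changed file(s):`, files);
}

watch("./src", { recursive: true }, (_event, filename) => {
  if (!filename) return;
  pending.add(filename.toString());
  clearTimeout(timer);
  // Debounce: collapse rapid successive saves into one review request.
  timer = setTimeout(() => {
    const batch = [...pending];
    pending.clear();
    void requestReview(batch);
  }, 500);
});

console.log("watching ./src for changes...");
```

The debounce is the interesting design problem: too short and the agent reviews half-saved files; too long and it stops feeling ambient.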
Anthropic's official response: "No customer data was exposed. This was a packaging error, not a security breach." Technically accurate. But for competitors like OpenAI, Google, and xAI, the exposure was a goldmine -- Anthropic's architectural decisions, unreleased roadmap, and internal engineering philosophy were all visible in the code.
## The Pattern Connecting Both Incidents
These incidents look different on the surface -- one involved content, the other code. But the root cause is identical: internal assets that were never meant to be public slipped past the controls meant to keep them private.
The first was a CMS configuration error. The second was a build pipeline oversight. Neither involved external attackers. And both were discovered by outsiders, not by internal security audits. That last point stings the most.
Anthropic has championed its "Responsible Scaling Policy" since 2024 -- a framework for gating model deployment based on risk assessment. The implicit promise is: "We can control the risks." Two unforced errors in five days undermine that promise in a way that no theoretical safety debate could.
## Why This Matters Beyond Anthropic
In the AI safety conversation, Anthropic has served as the industry's proof of concept -- the evidence that a company could prioritize safety without sacrificing performance. While OpenAI sprinted toward commercialization and Meta went all-in on open source, Anthropic occupied the middle ground: careful, deliberate, and capable.
That positioning is now under pressure. This very week, Tennessee enacted a law banning AI systems from impersonating mental health professionals. Idaho passed four AI-related bills. Oregon and Washington already enacted chatbot safety legislation. The self-regulation argument -- "trust AI companies to police themselves" -- had Anthropic as its strongest exhibit. That exhibit now has cracks.
Paradoxically, the Claude Code leak also demonstrated Anthropic's technical lead. Multiple analyses found that Anthropic's agent architecture is 12 to 18 months ahead of competitors. Features like persistent assistant mode and metacognitive self-review are concepts no other company has publicly shipped. The leak proved the technology is real -- just not ready for public eyes yet.
## The Timing Problem: IPO and Mythos Launch
Anthropic's implied valuation recently crossed $600 billion in secondary markets. The company raised $30 billion in Q1 2026 alone. An IPO is widely expected.
Back-to-back security incidents could hardly have come at a worse moment -- especially with Mythos on the horizon. Polymarket gives roughly 25% odds of a launch before April 30. Anthropic needs the "most powerful model ever" milestone before going public, but the world already knows its own internal assessment: this model makes large-scale cyberattacks more feasible.
Mythos isn't just a product launch. It's a test case for whether a company can responsibly deploy a model it has itself described as the most dangerous ever built.
## What This Means for You
If you're a developer, three takeaways from this week.
First, audit your build pipelines for source maps. The Claude Code leak started with a single .map file in a production npm package. Removing source maps from production builds is Security 101, and Anthropic-scale companies still miss it. Check your own projects.
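One way to make that audit automatic, sketched below, is a CI guard around `npm pack --dry-run --json`, which lists exactly what would ship in the published tarball. A hypothetical script -- adapt the check to your own pipeline:

```typescript
// check-no-sourcemaps.ts -- sketch of a CI guard: fail the build if any
// .map file would be included in the published npm package.
import { execSync } from "node:child_process";

// `npm pack --dry-run --json` reports the publish contents without
// actually creating a tarball.
const report = JSON.parse(
  execSync("npm pack --dry-run --json", { encoding: "utf8" })
);

const maps: string[] = report[0].files
  .map((f: { path: string }) => f.path)
  .filter((p: string) => p.endsWith(".map"));

if (maps.length > 0) {
  console.error("source maps in publish payload:", maps);
  process.exit(1);
}
console.log("publish payload is clean: no .map files");
```

Wiring this into a `prepublishOnly` script means a stray map can never reach the registry.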
Second, study the leaked architecture. Persistent assistant mode and metacognitive self-review represent where agent development is heading. These patterns will become industry standard within 12 months.
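You can experiment with the self-review pattern today: generate, critique, revise. The sketch below illustrates that loop; it is not the leaked implementation, and `callModel` is a hypothetical stand-in for any chat-completion API:

```typescript
// self-review.ts -- an illustration of the generate/critique/revise
// pattern, not Anthropic's code. Swap in your provider's SDK for callModel.
interface Review {
  passed: boolean;
  feedback: string;
}

// Hypothetical model call; any chat-completion API fits this shape.
declare function callModel(prompt: string): Promise<string>;

async function generateWithSelfReview(
  task: string,
  maxAttempts = 3
): Promise<string> {
  let draft = await callModel(`Complete this task:\n${task}`);

  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    // Second pass: the model grades its own output against the task.
    const review: Review = JSON.parse(
      await callModel(
        `Review this answer to "${task}". Reply as JSON ` +
          `{"passed": boolean, "feedback": string}:\n\n${draft}`
      )
    );
    if (review.passed) break;
    // Third pass: revise the draft using the critique.
    draft = await callModel(
      `Revise the answer.\nTask: ${task}\nFeedback: ${review.feedback}\n\nAnswer:\n${draft}`
    );
  }
  return draft;
}
```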
Third, remember that AI safety is an operations problem, not just a research problem. The most brilliant safety research team in the world is irrelevant if DevOps and infrastructure security don't hold. Anthropic proved that this week.