Koda Intelligence · Deep Dive

Where Anthropic hid Mythos.

Anthropic spent a year saying Claude Mythos was too dangerous to release. On April 7, the company gave it to twelve named partners under Project Glasswing. Apple, Google, Microsoft, JPMorgan. Within hours, a private Discord group was running the model anyway. They did not break in through a zero-day. They studied Anthropic's URL patterns from earlier model releases, cross-referenced data from a separate Mercor breach, and used an unrevoked contractor credential. Three minor signals, combined, walked them straight into the most restricted AI model of 2026. Bloomberg broke it on April 21. Anthropic confirmed the same afternoon. This is the company's third public security incident in 30 days.

8 MIN READ · BY THE KODA EDITORIAL TEAM · CYBERSECURITY · AI AGENTS · CONTAINMENT
MODEL ACCESSED: MYTHOS (Apr 7 · same-day breach) · PARTNERS APPROVED: 12 (Apple · Google · Microsoft · JPMorgan) · ATTACK VECTORS: 3 (URL · Mercor · badge) · EXPLOITS USED: 0 (no zero-days) · DETECTION WINDOW: 2 weeks (found by Bloomberg) · ANTHROPIC INCIDENTS, 30D: 3 (Mar 26 · Apr 1 · Apr 22) · MERCOR BREACH (UPSTREAM): $10B (TeamPCP · LiteLLM · March) · PUBLIC RESPONSE: "INVESTIGATING" (statement verbatim)


Anthropic spent a year telling the world it had built an AI so dangerous it could not be released to the public. On April 7, 2026, the company gave Claude Mythos to twelve named partners under a quarantine program called Project Glasswing. The model was withheld from the API, withheld from Claude.ai, withheld from every public surface Anthropic owns.

Within hours, a private Discord group was running it. They did not break in through a zero-day. They did not phish anyone. They did not ship malware. They studied Anthropic's URL patterns from earlier model releases, cross-referenced data from a separate Mercor breach, and used an unrevoked credential held by one member who worked at an Anthropic vendor. Three minor signals, combined, walked them straight into the most restricted AI model of 2026.

Bloomberg broke the story on April 21. Anthropic confirmed the same afternoon: "We're investigating a report claiming unauthorised access to Claude Mythos Preview through one of our third-party vendor environments."

The model was not the weakest link. The containment was.

The Containment Gap

The Approval Gap is the missing checkpoint between an agent's analysis and its action. The Containment Gap is its companion failure: the missing checkpoint between a system you decided not to release and the people who can still reach it anyway.

Every restricted AI model has a containment surface. It maps to four questions: where is the model hosted, what identifies the host, who has access today, and what happens when "today" stops being today. The Mythos breach failed across all four. The host followed an identifiable pattern. The pattern was inferable from prior public releases. A vendor environment held a permanent contractor credential that nobody revoked. And nobody noticed for two weeks, because nobody looked at the logs until Bloomberg told Anthropic to.

Three vectors, no exploits

Vector 1 - URL pattern guessing. Anthropic distributes preview models to partners through vendor environments that follow a naming convention. Anyone familiar with how Anthropic structured previous limited releases could work out the likely address of the Mythos preview from public history. The Discord group did exactly that. Pure inference, no novelty.
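To make the scale of "pure inference" concrete: once a naming convention is known, the candidate space collapses to a handful of strings. A minimal sketch, in which every identifier (codenames, vendor slugs, the URL shape itself) is hypothetical and illustrative only:

```python
from itertools import product

# Hypothetical convention inferred from earlier releases; none of these
# names are real Anthropic identifiers.
codenames = ["mythos", "claude-mythos"]
vendors = ["partnerhost", "evalvendor"]   # illustrative vendor slugs
versions = range(1, 4)

candidates = [f"https://{c}-preview.{v}.example/v{n}"
              for c, v, n in product(codenames, vendors, versions)]
# 2 codenames x 2 vendors x 3 versions = 12 candidates to try:
# a Friday afternoon, not a brute-force campaign.
```

The point of the sketch is the arithmetic: a convention with three known dimensions yields a search space in the dozens, not the millions.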

Vector 2 - The Mercor breach data. Mercor is a $10 billion startup that provides labeled training data and experts to Anthropic, OpenAI, and Meta. A separate supply-chain attack on Mercor in March 2026 (linked to a hacking group called TeamPCP, exploited via the LiteLLM library) leaked the kind of internal naming conventions and vendor IDs the Discord group needed to confirm their guesses. They did not have to brute-force the URL space. They had a head start.

Vector 3 - An unrevoked contractor credential. One Discord member worked at a third-party contractor that does work for Anthropic. Their vendor environment access was never properly scoped, never properly revoked, and survived whatever offboarding process Anthropic uses for terminated or rotated contractor accounts. Bloomberg's reporting cites this person as the one who "provided enough operational knowledge to confirm the guess was right."

That is the full attack chain. Three minor signals. No exploit, no malware, no phishing.

ANTHROPIC INCIDENTS · 30 DAYS
Sources: Fortune · LA Times · Bloomberg · AFP

Three failures in a month, all called "human error."

MAR 26 · CMS misconfig leaked Mythos draft + 3,000 assets (Fortune found it)
APR 1 · Claude Code source code leaked publicly (1,900 files / 512K LOC)
APR 22 · Mythos accessed via 3-vector chain (Discord group · Bloomberg confirmed)
"HUMAN ERROR" · Anthropic's framing of all three (verbatim language)
12 · Mythos partners approved at launch (Apple, Google, Microsoft, JPMorgan + 8)
2 WEEKS · Discord access window before discovery (April 7 to April 21 · found by Bloomberg, not Anthropic)

The model wasn't the weakest link. The containment was. · THE CONTAINMENT GAP · KODA INTELLIGENCE · 23 APR 2026

Three signals every team running restricted AI should read

VECTOR 1 · URL

URL inheritance is not a secret.

If your v1 lived at a guessable address, your v2 lives at a guessable address. Anthropic's preview environments followed a naming convention you could infer from public history alone. The cost of inference: a Friday afternoon for a determined researcher. The cost of pretending it wasn't: a Bloomberg article.
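One way to take URL secrecy out of the trust equation entirely, sketched here with a placeholder key and hypothetical parameter names: require a signed, expiring parameter on every preview path, so a correctly guessed address proves nothing by itself.

```python
import hashlib
import hmac
import time

SIGNING_KEY = b"rotate-me-per-partner"  # placeholder; a real key lives in a KMS

def sign_path(path: str, partner_id: str, ttl_s: int = 3600) -> str:
    """Append an expiring HMAC so a guessed URL is useless on its own."""
    expires = int(time.time()) + ttl_s
    msg = f"{path}|{partner_id}|{expires}".encode()
    sig = hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()
    return f"{path}?partner={partner_id}&exp={expires}&sig={sig}"

def verify_path(path: str, partner_id: str, expires: int, sig: str) -> bool:
    """Reject expired links and any signature minted for a different partner."""
    if time.time() > expires:  # hard expiry, no grace period
        return False
    msg = f"{path}|{partner_id}|{expires}".encode()
    expected = hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

This is a sketch of one layer, not a full design; pair it with IP allowlists or per-partner mTLS as the article's Dojo section suggests.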

VECTOR 2 · MERCOR

Your breach surface includes your suppliers.

The Discord group did not need to brute-force Anthropic. They had Mercor's vendor IDs, leaked in a March supply-chain attack that exploited the LiteLLM library. Every supplier you ship credentials to becomes part of your blast radius. If Mercor leaks, you leak. Treat suppliers as part of your containment perimeter or treat them as attackers.

VECTOR 3 · BADGE

An unrevoked credential is a permanent vulnerability.

One Discord member held a contractor credential that survived past its purpose. That single artifact converted two pieces of inferential evidence into an actual breach. Most companies have dozens of these. Find them now or read about them in Bloomberg later.

The counter-argument · and why it doesn't hold

Anthropic's official statement says "there is no evidence that Anthropic's systems are impacted, nor that the reported activity extended beyond the third-party vendor environment." That language is honest in one sense and load-bearing in another.

It is honest because the breach genuinely did happen inside a vendor environment, not Anthropic's own production infrastructure. It is load-bearing in a way that should worry you if you are a CISO: the framing locates the failure entirely in the vendor and the contractor account, not in the system that allowed a guessable URL pattern to coexist with a permanent unrevoked credential. That is the same risk model that produced 13-hour AWS outages triggered by Amazon's AI tools, deleted production databases at Replit, and now an outsider group running Mythos for two weeks. If your incident postmortem ends with "the contractor should have been offboarded," the next incident is already on the calendar.

What changes in the next 90 days

Three things to watch.

Disclosure expectations. Anthropic disclosed only after Bloomberg approached them. Regulators are starting to notice. Expect agency guidance to clarify that containment-class incidents (where a restricted model reaches an unauthorized environment) are reportable even when no production data was touched. Watch the FTC's AI Audit Procedures guidance and the EU AI Office's preview-environment rules, both expected in Q3 2026.

Vendor offboarding becomes a security primitive. Treating contractor offboarding as an HR task is the bug. Expect a wave of products that re-frame it as a security primitive: automated credential rotation tied to engagement-end events, audit trails as a default, scoped-access tokens with hard expiries instead of permanent badges. The first vendor to ship this as a managed service will own a category.
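The mechanics are simple enough to sketch. Assuming a hypothetical token service, the one rule that matters is that a credential's expiry is clamped to the engagement end date at issue time, so offboarding happens mechanically even if HR forgets:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ScopedToken:
    contractor_id: str
    scope: str            # e.g. "preview:read"; scope names are illustrative
    expires_at: datetime

def issue_token(contractor_id: str, scope: str,
                requested_until: datetime, engagement_end: datetime) -> ScopedToken:
    """A token can never outlive the engagement: expiry is the earlier of
    what was requested and when the contract ends."""
    return ScopedToken(contractor_id, scope, min(requested_until, engagement_end))

def is_valid(token: ScopedToken, now: datetime) -> bool:
    return now < token.expires_at
```

The design choice is that revocation is the default state after engagement end, not an action someone has to remember to take.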

The "too dangerous to release" framing gets cheaper. Anthropic's Mythos rollout was the highest-profile use of the safety-first containment narrative in 2026. The Discord breach made that narrative easier to attack. Expect competing labs to either (a) move away from the framing entirely and lean on incremental capability releases, or (b) double down by pairing the language with public, third-party audited containment proofs. Watch which direction OpenAI, Google DeepMind, and Mistral pick at their next major release.

THE KODA DOJO · ACTIONABLE NEXT STEPS

Audit your containment surface this week.

  1. Map your URL inheritance. List every restricted endpoint, internal preview environment, and partner-only model URL. Group by naming convention. If a security researcher who knows v1 lived at thing-v1.vendor.example could guess v2, your URL is not a secret. Add a layer that does not depend on URL secrecy: signed paths, IP allowlists, per-partner mTLS.
  2. Pull the contractor and vendor offboarding ledger. Every account that has touched a restricted environment in the past 12 months. Engagement end date. Revocation date. Audit trail. If any of those three is blank, you have a Mythos-grade liability sitting in production right now.
  3. Monitor your supplier breach surface. Mercor's March breach leaked the data that made the Anthropic guess work. Subscribe to disclosure feeds for every supplier that has ever held credentials in your environment. When they breach, you breach.
  4. Hire someone to break your containment claim. If you tell regulators a model is too dangerous to release, that is a claim about a system, not a model. Pay an external red team to test it before someone tests it for free. The Discord group did Anthropic a favor by going to Bloomberg. The next group will not.
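Step 2 of the checklist above is mechanical enough to automate. A minimal sketch, with hypothetical account names and ledger shape: any row missing an engagement end date, a revocation date, or an audit trail gets flagged.

```python
from datetime import date

# Hypothetical offboarding ledger rows:
# (account, engagement_end, revocation_date, audit_trail_present)
ledger = [
    ("vendor-ops-17", date(2025, 11, 30), date(2025, 12, 1), True),  # clean
    ("contractor-badge-03", date(2026, 1, 15), None, True),          # never revoked
    ("preview-svc-acct", None, None, False),                         # no end date at all
]

def mythos_grade_liabilities(rows):
    """Flag any account missing an end date, a revocation, or an audit trail."""
    return [acct for acct, end, revoked, trail in rows
            if end is None or revoked is None or not trail]

flagged = mythos_grade_liabilities(ledger)
```

Run it against every account that has touched a restricted environment in the past 12 months; a non-empty result is the liability the article describes.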
THE BOTTOM LINE

Build the locks before you build the legend.

If a model is genuinely too dangerous to release, a guessable URL plus an unrevoked badge is not enough containment. If a model is not actually too dangerous to release, the safety language was marketing. From the outside, both readings of the Mythos incident look identical, and the operational lesson is the same: do not trust containment claims that have not survived an external probe. The labs that win the next three years on the cybersecurity beat will not be the ones with the loudest safety language. They will be the ones whose containment claims have been independently audited and whose vendor offboarding is mechanically enforced rather than humanly remembered. Anthropic's containment was thinner than the public statement suggested. Yours probably is too.

Want this every morning?

AI analysis, world news, markets, and tools. One briefing, delivered free.

One email per day. No spam. Unsubscribe anytime.