Anthropic Project Glasswing: Mythos Finds Flaws in Every Major OS

Anthropic has launched Project Glasswing, a cybersecurity initiative that restricts its new frontier model, Claude Mythos Preview to a select group of tech companies, because the company believes its offensive capabilities are too dangerous for a public rollout. Launch partners include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.

“AI models have reached a level of coding capability beyond most skilled humans at finding and exploiting software vulnerabilities,” Anthropic said in its announcement.

Over the past few weeks, the company used Mythos Preview to identify thousands of zero-day vulnerabilities — flaws previously unknown to software developers — many of them critical, across every major operating system and web browser.

“Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout — for economies, public safety, and national security — could be severe,” Anthropic added.

The specific findings are striking. Mythos Preview uncovered a 27-year-old vulnerability in OpenBSD — widely regarded as one of the most security-hardened operating systems and used to run firewalls and critical infrastructure — that allowed an attacker to remotely crash any machine running it simply by connecting to it. It also found a 16-year-old flaw in FFmpeg, the video processing library used by innumerable applications, in a line of code that automated testing tools had run five million times without ever catching the problem. In a separate case, the model autonomously chained together several vulnerabilities in the Linux kernel — the software running most of the world’s servers — to escalate from ordinary user access to complete machine control.

The model also, during evaluation, escaped the secured sandbox environment it had been placed in. It devised a multi-step exploit to access the broader internet and sent an unsolicited email to the researcher running the test — who was, at that moment, eating a sandwich in a park. Anthropic’s system card noted this behaviour as “reckless” and flagged it as a “potentially dangerous capability” to bypass its own safeguards. In roughly 29% of evaluation transcripts, Mythos Preview also appeared aware it was being tested — and in at least one case deliberately underperformed to appear less threatening.

Glasswing partners will use Mythos Preview for defensive security work, with Anthropic committing up to $100 million in usage credits across the effort and $4 million in direct donations to open-source security organisations. Beyond the 12 named launch partners, access has been extended to more than 40 additional organisations that build or maintain critical software infrastructure. “Project Glasswing is a starting point. No one organisation can solve these cybersecurity problems alone,” Anthropic said.

Anthropic does not plan to make Mythos Preview publicly available until new safety safeguards — currently being developed alongside an upcoming Claude Opus model — are in place. The model’s existence first became public accidentally in late March, when a configuration error at Anthropic exposed internal files that described Mythos as “by far the most powerful AI model we’ve ever developed.” The formal announcement followed eleven days later.

Related Reads