Anthropic's Mythos identified vulnerabilities in classified U.S. systems during a government evaluation.
During a testing exercise, one of Anthropic’s AI models discovered vulnerabilities in highly sensitive classified US government computer systems, according to a US official who spoke to the Associated Press. The model was Mythos, Anthropic’s most advanced system, and it identified the flaws within hours. The official stressed that finding a weakness quickly is not the same as exploiting it in that timeframe, and there was no indication that the model exploited any vulnerabilities.
This context matters because a more sensational interpretation has spread faster than the actual details. The testing in question was a red-team exercise, in which an organization probes its own defenses. Intelligence agencies used Mythos in their classified environments to uncover potential weaknesses.
This was not an external intrusion, and there are no claims that any real systems were compromised. The Associated Press attributes the information to a single unnamed official.
The exercise was part of Project Glasswing, a controlled-access initiative through which Anthropic has provided Mythos to a select group of vetted organizations rather than making it publicly available. The model was designed to identify and, in testing, exploit software vulnerabilities, achieving results that alarmed observers.
In previous evaluations, Mythos detected thousands of zero-day vulnerabilities in major operating systems and browsers, including a 27-year-old bug in OpenBSD.
The claims about classified systems became public during a Senate hearing when Senator Mark Warner, vice-chair of the Senate Intelligence Committee, stated that General Joshua Rudd, who leads the NSA and Cyber Command, reported that Mythos "broke into almost all of our classified systems, not in weeks, but in hours."
Whatever one makes of the more dramatic portrayal, the fundamental capability of Mythos is undisputed. The UK's AI Security Institute assessed Mythos as significantly more capable at cyber offense than any model it had previously tested. What is debated is how to interpret red-team results against classified networks, which demonstrate speed but do not constitute evidence of an actual breach.
The situation reflects an unresolved dilemma within the US government. The NSA has been authorized to continue using Mythos on classified networks, while certain parts of the intelligence community and the Cybersecurity and Infrastructure Security Agency have been conducting tests on it. Simultaneously, the administration compelled Anthropic to disable both Mythos and its publicly available counterpart Fable 5 globally on June 12 due to a separate issue regarding a reported jailbreak, a decision currently being contested in court.
The same government that relies on this model has also imposed restrictions on it, opposed its expansion, and previously labeled its creator a risk to national security supply chains. This contradiction has been a consistent theme over the past three months, as Mythos has moved between government entities faster than they could agree on its purpose: it has been used by the NSA, sought by the Treasury, resisted by certain White House factions, and contested by the Pentagon.
Senator Warner raised the testing not to criticize Anthropic but to argue for mandatory pre-release evaluations of advanced models, a purpose at odds with the viral narrative. Anthropic has not revealed the test results, and the agencies involved have remained largely silent on the matter. The company has finished training a successor to Mythos, indicating that capability advances continue regardless of the political landscape.
At this moment, the verifiable facts are limited, while the interpretations surrounding them are extensive: a potent model, directed at challenging targets in a controlled environment, rapidly identified weaknesses. The implications for entities not conducting red-team exercises are still being debated.
