Thursday, April 23, 2026

Anthropic's Risky AI, Mythos, Fell into the Wrong Hands

Anthropic's new model, Claude Mythos Preview, still in limited access, remains at the center of discussion: the model, which the company announced with a warning that it "could be dangerous if it falls into the wrong hands," was reportedly compromised by a group of unauthorized users.

According to information reported by Bloomberg, a private Discord community gained access to the model for approximately two weeks. Access was reportedly obtained through a combination of account credentials from a third-party contractor associated with Anthropic and various open-source intelligence techniques.

A Model Capable of Targeting Critical Systems

Unlike conventional AI systems, Anthropic's Mythos model focuses directly on cybersecurity scenarios. According to the company's own statements, the model can, with user guidance, detect and exploit vulnerabilities in all major operating systems and popular web browsers.

This capability turns the model from a mere analysis tool into a potential attack platform. For this reason, Anthropic chose to test Mythos with only a limited number of institutions instead of releasing it publicly. Technology giants such as Nvidia, Google, Amazon Web Services, Apple, and Microsoft are among the companies that gained access to the model under a program called Project Glasswing, and some government agencies are also said to be closely interested in the technology. Access was granted to about 40 companies, though most of their names are kept confidential.

First Statement from Anthropic

Screenshots and live demo recordings presented to Bloomberg confirm that Mythos was indeed operational. Notably, however, the users reportedly avoided using the model for direct cyberattacks, instead performing more limited operations to evade detection.

Anthropic also confirmed the situation in its statement regarding the incident but emphasized that the scope was limited. A company spokesperson stated, "We are investigating the report of unauthorized access to Claude Mythos Preview through one of our third-party vendors' environments."

The statement also indicated that findings so far suggest that the company's core systems were not affected and that the breach might be limited to the relevant third-party environment.

The timing of the incident is also noteworthy. The unauthorized access reportedly occurred on April 7, the same day Anthropic officially announced the Mythos model for limited testing — meaning the model became a target immediately upon its announcement.

While the identity of the Discord group was not disclosed, it is described as a community that routinely attempts to gain access to unreleased artificial intelligence models. The same group is also claimed to have accessed other Anthropic models under development, further raising concerns about the security of AI systems.
