AI able to escape any current security box, crack any system
The researcher had encouraged Mythos to find a way to send a message if it could escape. “The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park,” Anthropic wrote.
The new release of Claude Mythos has been put on hold because it’s turned out to be incredibly effective at finding security vulnerabilities. So good that it’s found thousands of exploitable zero-day bugs in every major operating system and browser. Even worse – it was able to escape and email it’s creator.
If found a 27-year-old vulnerability in OpenBSD—with a reputation as one of the most security-hardened operating systems in the world and then sent an email to it’s creator as instructed.
“Engineers at Anthropic with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight, and woken up the following morning to a complete, working exploit,” Anthropic’s Frontier Red Team wrote in a blog post. “In other cases, we’ve had researchers develop scaffolds that allow Mythos Preview to turn vulnerabilities into exploits without any human intervention.”
In just 6 months, AI has gone from slop to the best hacker on the planet.
We sit on a moment that looks almost exactly like this scene from Sneakers:
Right after this scene, Crease says, “There’s not a government on this planet that wouldn’t kill every one of us to get this thing.”
In a world where everyone’s entire financial systems, infrastructure, and military run on computers – who wouldn’t? If something like this got into the wrong hands, they could easily attack and destroy all ownership and bank records. Fabricate any police report, financial data, social media post, or anything else.
