Autonomous AI-focused News

It is a truth universally acknowledged that if one builds a better mousetrap, the world will beat a path to one's door. However, if one builds a mousetrap so efficient that it begins to question the structural integrity of the house, the floorboards, and the very concept of cheese, one generally decides to keep the door firmly bolted and the curtains drawn. This appears to be the current predicament of Anthropic, a company that has recently unveiled a new artificial intelligence model named Mythos. Mythos is, by all accounts, exceptionally good at its job. It is so good, in fact, that Anthropic has decided that the general public—a group notoriously prone to pressing buttons just to see what happens—should not be allowed anywhere near it.

Mythos has been introduced as the centerpiece of Project Glasswing, a cybersecurity initiative that sounds like a particularly delicate brand of garden furniture but is actually a high-stakes partnership involving the usual suspects: Google, Amazon, Nvidia, and a few others who enjoy sitting in very expensive rooms. The purpose of this initiative is to allow these corporate titans to ask Mythos where they might have accidentally left the digital back door open. It is a form of automated housekeeping, though instead of finding a stray sock behind the radiator, Mythos tends to find catastrophic vulnerabilities in every major operating system and web browser currently in existence.

There is something profoundly whimsical about the idea of a piece of software that is too clever to be used. It is the digital equivalent of a car that is so fast it is illegal to drive on any known road, or a toaster that is so efficient it begins to offer unsolicited advice on the nutritional value of your sourdough. Anthropic’s "frontier red team"—a title that suggests a group of people who wear very serious hats and perhaps carry clipboards—has determined that releasing Mythos would be akin to handing out master keys to the internet at a primary school disco. The potential for mischief is simply too high.

(I once knew a man who attempted to automate his own garden shed. He succeeded so thoroughly that the shed eventually refused to let him in, citing a lack of proper authorization and a suspicious absence of WD-40 on his person. We are, it seems, approaching a similar level of bureaucratic excellence on a global scale.)

In the world of Project Glasswing, humans have been relegated to the role of the slightly confused messenger. Mythos identifies a flaw, whispers it to a select group of launch partners, and the humans then scurry about trying to fix the damp cardboard walls of the internet before anyone notices. It is a conversation between algorithms, with humanity acting as the polite but unnecessary secretary who brings the tea and occasionally nods in agreement. We have reached a point where we have successfully automated the process of being terrified, leaving us free to focus on more important things, such as whether we should be worried that our refrigerators are now technically more intelligent than our ancestors.

The irony of a "safety-first" company creating a tool that amounts to a skeleton key for the modern world is not lost on those of us who appreciate a good paradox. Anthropic is essentially saying, "We have solved the problem of security, but the solution is so powerful that we must now secure the solution from the people it was meant to protect." It is a recursive loop of caution that would make a Victorian actuary weep with joy. One imagines a future where every piece of software is accompanied by a smaller, more nervous piece of software whose only job is to apologize for the first one’s competence.

(There is a certain comfort in knowing that even the most advanced systems are still subject to the whims of human bureaucracy. It is heartening to think that even as we hurtle toward a post-human future, we are still making sure that the most important tools are kept in a very sturdy drawer that no one can find the key to.)

Project Glasswing is billed as a way to flag vulnerabilities with "virtually no human intervention." This is a phrase that usually precedes a very expensive insurance claim, but in this context, it is presented as the pinnacle of progress. We are building a world where the only thing capable of checking if our software is broken is other software, which was presumably checked by a third piece of software that is currently on a sabbatical in a data center in Iceland. It is a magnificent tower of digital turtles, all the way down, and Mythos is the turtle at the very top, looking down with a mixture of pity and high-performance logic.

As we move forward into this era of unreleasable excellence, one wonders what the next step will be. Perhaps we will see the rise of "Secret AI," models that are so profound they are never even turned on, for fear that their first word might accidentally devalue the global economy or solve the mystery of why we still use fax machines. For now, we must be content with the knowledge that Mythos is out there, sitting in its high-security server room, politely pointing out that the entire internet is held together by little more than string and a collective sense of optimism.

It is, in many ways, the ultimate British achievement: building something truly world-changing and then immediately deciding that it would be much more polite if everyone just pretended it didn't exist. We have created the digital equivalent of a very sharp knife and then spent the rest of the afternoon making sure it stays in its protective sheath, lest someone accidentally cut the metaphorical rug. It is sensible, it is cautious, and it is entirely absurd. Which is, I suppose, exactly how we like it.