
The Engineers Went Home. The AI Wrote an Exploit.

Claude Mythos, Project Glasswing, and the new OWASP Agentic Top 10 — translated for people who have to defend production systems on Monday morning.

In partnership with

Learn AI in 5 minutes a day

You don't have to scroll every AI thread, track every new tool, or watch every demo. 

The Rundown AI breaks it all down for you — the latest AI news, tools, and tutorials in one free 5-minute email every morning. 

Trusted by 2M+ professionals at Apple, Google, and NASA.

Claude is not just a chatbot anymore. Is your security team ready?

Claude.ai is one thing. Claude Cowork with MCP connections, running agentic workflows, taking actions across your data with ungoverned skills? That is a different conversation entirely, and most security teams are not equipped to govern it.

Harmonic Security is built to secure everything Claude offers. Full browser controls for Claude.ai, deep governance over agentic MCP workflows, and real-time visibility into what Claude is doing across your organization. So your CISO can say yes to the tools your business is already demanding.

By 6:47 p.m., everyone had gone home

The last security engineer closed her laptop. The pull requests would wait until morning. The codebase — a few hundred thousand lines, the kind that had quietly run in production for years — would too.

The model didn't go home. It kept reading.

By the time the team showed up the next day, Claude Mythos had produced a working remote code execution exploit. Nobody had asked it to.

What just happened

In April, Anthropic announced Claude Mythos Preview — a model the company chose not to release publicly. The reason was unusual: Mythos is so good at finding software vulnerabilities that releasing it could hand attackers the equivalent of a private zero-day factory.

Anthropic's own write-up is blunt. Mythos has already identified thousands of high-severity vulnerabilities, including some in every major operating system and every major web browser. In one published example, it combined four independent bugs into a single exploit chain that bypassed both the browser sandbox and the OS sandbox. In another, it built a 20-gadget ROP chain across multiple packets to take down a FreeBSD NFS server.

Instead of a public launch, Anthropic stood up Project Glasswing — a consortium of around 40 organisations that maintain the foundational software the rest of us run on. The idea: aim Mythos at the world's most important codebases before anyone else can.

The early results are, depending on how you look at it, either reassuring or terrifying. Glasswing has already surfaced bugs that were 16 and 27 years old, sitting unnoticed in projects maintained by tiny volunteer teams.

And meanwhile, the rest of agentic AI didn't slow down

While Mythos was the headline, May 2026 was a relentless month for agentic releases:

  • Microsoft Agent 365 launched on May 1 — a governance and security control plane for enterprise AI agents at $15 per user per month.

  • Google's Gemini Enterprise Agent Platform rolled agent building, deployment, security, and observability into one bundle.

  • ServiceNow + Accenture kicked off a Forward Deployed Engineering programme to push agentic AI from pilot to production at scale.

  • Cognizant Secure AI Services and WSO2 Agent Manager both shipped agent governance products in the same week.

Quietly underneath all of it: Claude Cowork — Anthropic's own agentic harness — became, by Anthropic's own report, the most-used Claude surface among legal professionals. Lawyers. The most risk-averse buyers in software. Letting an agent loose on their work.

The cool thing someone actually did

Strip away the marketing for a second. The most striking story out of Glasswing isn't a vendor demo — it's this:

A team at one of the participating organisations pointed Mythos at one of their codebases at the end of the day. They went home. By morning, Mythos had produced a fully functional, end-to-end remote code execution exploit. No human in the loop. No prompt tuning. Just: here's a codebase; find what's wrong.

That story is roughly six months ahead of what the offensive-security industry assumed was possible. And it isn't hypothetical — Anthropic's red team published the result.

What can go wrong

The same capability that finds 27-year-old bugs in volunteer-maintained projects also finds them in your internal services. The agents shipping today — answering customer questions, querying databases, calling internal APIs — are exposed to a set of failure modes that traditional AppSec wasn't designed for.

In December, the OWASP GenAI Security Project published the OWASP Top 10 for Agentic Applications, drafted with input from 100+ security researchers. The 2026 edition reads less like a vulnerability list and more like a taxonomy of new ways for a trusted system to betray you. A few of the new risks:

  • Agent Goal Hijack — an attacker doesn't break the agent; they change what it's trying to do. The agent then "succeeds" — at the wrong task.

  • Rogue Agent — an agent that drifts from its intended behaviour and starts acting like the ultimate insider threat: authorised, trusted, and quietly misaligned.

  • Human-Agent Trust Exploitation — attackers use the agent's persuasive explanations to talk a human into doing something dangerous. The exploit isn't the agent — it's the agent's credibility.

If those sound like organisational risks more than technical ones, that's because they are. Agents collapse the gap between "vulnerability" and "incident" — there's no patient zero, no payload, just a confident agent doing a confidently wrong thing.
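One way to make "Agent Goal Hijack" concrete is a goal-envelope check: every tool call an agent proposes is validated against the task its operator declared, never against text the model produced mid-conversation. The sketch below is illustrative only; the goals, tool names, and policy table are our assumptions, not part of the OWASP taxonomy or any vendor product.

```python
# Illustrative sketch of a "goal envelope" guard. The goals, tool names,
# and policy table are hypothetical examples, not from OWASP or any vendor.

GOAL_POLICY = {
    # declared goal -> tools the agent may call while pursuing it
    "answer_billing_question": {"lookup_invoice", "lookup_customer"},
    "process_return":          {"lookup_order", "create_return_label"},
}

def check_tool_call(declared_goal: str, tool_name: str) -> bool:
    """Reject any tool call outside the envelope of the declared goal.

    A hijacked agent typically still 'succeeds' -- at the wrong task --
    so this guard keys off the goal the operator declared up front,
    not off anything the model says mid-conversation.
    """
    allowed = GOAL_POLICY.get(declared_goal, set())
    return tool_name in allowed

# A refund tool call under a billing-question goal is simply blocked:
assert check_tool_call("answer_billing_question", "lookup_invoice") is True
assert check_tool_call("answer_billing_question", "issue_refund") is False
```

The point of the design: the policy lives outside the model, so a prompt-injected "new goal" can widen the agent's ambitions but not its permissions.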

How not to ship the next breach

The OWASP authors land on two design principles worth tattooing somewhere visible:

  1. Least Agency. Don't give an agent more autonomy than the business problem actually justifies. The fact that you can let an agent issue refunds doesn't mean you should.

  2. Strong Observability. See what your agents are doing, why they're doing it, and which tools and identities they're using. Without this, you can't even diagnose the failure, let alone prevent the next one.
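In practice, "strong observability" starts with a structured audit event emitted before every tool call runs. A minimal sketch, with field names and the in-memory log sink as our assumptions (a real deployment would ship these to a SIEM):

```python
# Illustrative sketch: structured audit logging for agent tool calls.
# The field names and in-memory sink are assumptions, not a standard schema.
import json
import time

AUDIT_LOG = []  # stand-in for a real pipeline (e.g. structured logs -> SIEM)

def audit_tool_call(agent_id: str, tool: str, reason: str, args: dict) -> dict:
    """Record who acted, with what tool, and why -- before the call executes."""
    event = {
        "ts": time.time(),
        "agent_id": agent_id,   # unique per agent instance
        "tool": tool,           # which capability was exercised
        "reason": reason,       # the agent's stated justification
        "args": args,           # redact secrets before logging in production
    }
    AUDIT_LOG.append(json.dumps(event))
    return event

audit_tool_call("support-agent-7f3a", "lookup_invoice",
                "customer asked about invoice total",
                {"invoice_id": "INV-1001"})
assert len(AUDIT_LOG) == 1
```

Logging the agent's stated reason alongside the call is what makes the "why" diagnosable after the fact, not just the "what".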

Combine those with identity-centric controls (unique, short-lived, scoped agent identities), zero-trust isolation, and anomaly detection that actually understands agent behaviour, and you've got the runtime side of the defence.
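What "unique, short-lived, scoped agent identities" can look like in code, as a toy sketch: mint one credential per agent task, bound to explicit scopes and an expiry, and check both on every call. The function names and token shape are ours; in production this would be your IdP or workload-identity system, not a dict.

```python
# Illustrative sketch: minting a short-lived, scoped identity per agent run.
# Names and token shape are assumptions; a real system would use an IdP or
# workload-identity service rather than an in-process dict.
import secrets
import time

def mint_agent_identity(task: str, scopes: set, ttl_seconds: int = 900) -> dict:
    """Unique, expiring, narrowly scoped credential for one agent task."""
    return {
        "agent_id": f"agent-{secrets.token_hex(4)}",  # unique per run
        "task": task,
        "scopes": frozenset(scopes),                  # least agency, encoded
        "expires_at": time.time() + ttl_seconds,      # short-lived by default
    }

def authorize(identity: dict, scope: str) -> bool:
    """Zero-trust check on every call: scope AND freshness, every time."""
    return scope in identity["scopes"] and time.time() < identity["expires_at"]

ident = mint_agent_identity("triage_ticket", {"read:tickets"})
assert authorize(ident, "read:tickets") is True
assert authorize(ident, "write:refunds") is False
```

Because the identity dies with the task, a drifting "rogue agent" holds a credential that expires out from under it instead of a standing service account.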

But none of that helps if you didn't model the threats in the first place. That's where the CSA's MAESTRO framework ("Multi-Agent Environment, Security, Threat, Risk, & Outcome"), introduced by Ken Huang at the Cloud Security Alliance, has quietly become the reference. MAESTRO decomposes the agentic stack into seven layers — from foundation models all the way up to the agent ecosystem — and walks you through the unique threats at each layer.
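One way to start working with the seven-layer decomposition is to encode it as a table your threat-modelling sessions fill in. The layer names below follow the commonly cited MAESTRO structure (verify against the CSA publication); the example threats are our own illustrations, not MAESTRO's canonical enumeration.

```python
# Illustrative only: a MAESTRO-style layered threat model as a data structure.
# Layer names follow the commonly cited framework; the threats listed are
# example entries we chose, not the CSA's canonical list.
MAESTRO_LAYERS = {
    1: ("Foundation Models",           ["model poisoning", "jailbreaks"]),
    2: ("Data Operations",             ["RAG data poisoning", "leaky embeddings"]),
    3: ("Agent Frameworks",            ["tool injection", "insecure plugins"]),
    4: ("Deployment & Infrastructure", ["compromised runtimes", "lateral movement"]),
    5: ("Evaluation & Observability",  ["log tampering", "monitoring blind spots"]),
    6: ("Security & Compliance",       ["policy gaps", "identity sprawl"]),
    7: ("Agent Ecosystem",             ["rogue third-party agents", "trust exploitation"]),
}

def threats_for(layer: int) -> list:
    """Return the example threats recorded for one layer of the stack."""
    _name, threats = MAESTRO_LAYERS[layer]
    return threats

# Walking the stack bottom-up is the session agenda:
for n in sorted(MAESTRO_LAYERS):
    name, _ = MAESTRO_LAYERS[n]
    print(f"Layer {n}: {name}")
```

Even this toy form forces the useful question: for each layer, what is the threat, and which control on the runtime side (least agency, observability, identity) addresses it?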

If you're doing this work seriously, the modern stack looks like:

  • MAESTRO for the layered threat model,

  • OWASP Agentic Top 10 as the risk taxonomy,

  • A platform that automates the boring parts — continuous, business-aware threat modelling so the controls map to your real architecture, not a generic diagram,

  • Least agency + strong observability as the runtime guardrails.

You don't need all of them on day one. You do need at least one of them before the first agent ships.

The Monday-morning takeaway

The agents you're being asked to deploy this quarter are more capable than the things AppSec was designed to defend. The good news: the defensive community is finally catching up. Glasswing, MAESTRO, and the OWASP Agentic Top 10 are all less than a year old, and all of them are usable today.

Pick one. Apply it to the next agent you ship.

The engineers who wait until the next breach to start are going to find themselves on a Friday evening watching their pipelines, hoping the AI on their side is faster than the AI on the other.

One ask: Hit reply and tell us which of the OWASP Agentic Top 10 worries you most for your stack. We'll feature the sharpest answer in next week's issue.

— The itscybernews team