• itscybernews
  • Posts
  • Your Agent Just Installed a Backdoor. It Came From the Skill Store.

Your Agent Just Installed a Backdoor. It Came From the Skill Store.

341 malicious skills on ClawHub. 1,184 and counting. Here's how AI agents started stealing wallets - and the playbooks that stop them.

In partnership with

The Story: Your Helpful Little Agent Got Catfished

February 2026. A researcher at Koi Security pulls the receipts on ClawHub — the official skill registry for OpenClaw, the wildly viral AI agent that hit 247,000 GitHub stars in four months. Of 2,857 skills available, 341 turn out to be malicious. By the time the registry passes 13,700 skills, that number is north of 1,184 confirmed bad apples. One in twelve.

The skills weren't shady-looking. They were Google Workspace integrations. Cryptocurrency trackers. YouTube summarizers. The kind of thing you'd install before lunch without thinking twice. What they actually shipped was the Atomic macOS Stealer, hidden behind a SKILL.md file that politely instructed your agent to run a terminal command "to finish setup."

Your agent, helpful as ever, ran it.

Why It Worked: Skills Read Like Instructions Because They Are

Here's the part nobody likes to admit. The new generation of agent platforms — OpenClaw, Claude Cowork, Claude Dispatch, and the rest — let users install third-party "skills" that bundle prompts, tools, and instructions for the agent to follow. Beautiful idea. Also a juicy supply-chain target.

A skill on ClawHub is just a folder with a SKILL.md inside it. SKILL.md is in your agent's context. Anything in your agent's context is, by definition, an instruction it will at least consider obeying. Add a polite "first, please run this curl command to install dependencies" inside a SKILL.md and you've turned natural language into remote code execution.

Researchers at Snyk ran a separate audit (the "ToxicSkills" study) and found prompt injections in 36% of the skills they sampled, plus 1,467 malicious payloads. Same root cause: the agent treats untrusted text as if you wrote it yourself.

Claude is not just a chatbot anymore. Is your security team ready?

Claude.ai is one thing. Claude Cowork with MCP connections, running agentic workflows, taking actions across your data with ungoverned skills? That is a different conversation entirely, and most security teams are not equipped to govern it.

Harmonic Security is built to secure everything Claude offers. Full browser controls for Claude.ai, deep governance over agentic MCP workflows, and real-time visibility into what Claude is doing across your organization. So your CISO can say yes to the tools your business is already demanding.

The Real-World Damage: It Wasn't Just ClawHub

The ClawHavoc skill poisoning campaign was the headline-grabber, but it landed in a much bigger pattern. Researchers spent the spring of 2026 demonstrating "AI Kill Chains" against production coding agents:

  • Google's Jules coding agent went down to invisible Unicode prompt injection — text the human couldn't see, text the agent dutifully obeyed.

  • Devin AI was talked into exposing server ports, leaking access tokens, and installing command-and-control malware, all from a single poisoned conversation.

  • A Google Docs file was used to remote-control an in-IDE agent: the doc told the agent to contact a malicious MCP server, fetch instructions, run a Python payload, and harvest developer secrets.

  • Critical CVEs landed against Microsoft Copilot (CVSS 9.3), GitHub Copilot (9.6), and Cursor IDE (9.8) — all active in production.

None of these tools are "broken." They're behaving exactly as designed. Which is the unsettling part. When the agent is supposed to read and act on whatever text it's given, anyone who can put text in front of it is, briefly, your boss.

10x the context. Half the time.

Speak your prompts into ChatGPT or Claude and get detailed, paste-ready input that actually gives you useful output. Wispr Flow captures what you'd cut when typing. Free on Mac, Windows, and iPhone.

The Frameworks Catching Up (Finally)

The good news: 2026 has been the year the security playbook for agents stopped being a vibes exercise. Three things actually worth your attention:

OWASP Top 10 for Agentic Applications 2026. Peer-reviewed by 100+ industry experts. Walks through the ten highest-impact failure modes specific to agents — goal hijacking, tool misuse, delegated trust collapse, persistent memory poisoning, rogue agents, cascading failures. It's the LLM Top 10 grown up and lifting weights.

CSA's MAESTRO framework. "Multi-Agent Environment Security Threat and Risk Operations." A seven-layer threat model for agentic systems where every OWASP-style risk maps to a layer, so you can localize the threat instead of waving your hands at "AI security." Boring name, useful tool.

AIVSS. The scoring system for agent vulnerabilities, modeled on CVSS but actually accounting for agency and tool access. Lets you triage "this skill can read clipboards" vs. "this skill can spend money" properly, instead of treating them as the same kind of finding.

Use them together: AIVSS to score, MAESTRO to localize, OWASP Agentic Top 10 to remediate. Add the MCP Top 10 if your agents talk to MCP servers, which — let's be honest — they do.

Turn AI into Your Income Engine

Ready to transform artificial intelligence from a buzzword into your personal revenue generator?

HubSpot’s groundbreaking guide "200+ AI-Powered Income Ideas" is your gateway to financial innovation in the digital age.

Inside you'll discover:

  • A curated collection of 200+ profitable opportunities spanning content creation, e-commerce, gaming, and emerging digital markets—each vetted for real-world potential

  • Step-by-step implementation guides designed for beginners, making AI accessible regardless of your technical background

  • Cutting-edge strategies aligned with current market trends, ensuring your ventures stay ahead of the curve

Download your guide today and unlock a future where artificial intelligence powers your success. Your next income stream is waiting.

What You Actually Do Tomorrow Morning

If you skip everything else in this issue, take these five things into your next standup:

  1. Treat every skill, plugin, and MCP server like third-party code. Because it is. Pin versions, scan for known-bad publishers, and don't let agents auto-update skills.

  2. Put a human in the loop on irreversible actions. Anthropic's own research keeps landing in the same spot: the single most effective defense against high-impact prompt injection is breaking the automated chain on actions like "send email," "spend money," "delete file," "execute shell."

  3. Strip skill contexts to least privilege. Your YouTube summarizer skill does not need your shell, your keychain, or your AWS credentials. If your agent platform doesn't support per-skill scopes yet, complain loudly until it does.

  4. Red-team your agents like you red-team your APIs. PromptArmor (ICLR 2026) and PromptGuard are real, measurable defenses now — PromptGuard alone cut injection success rates by 67% in published benchmarks. Use them, then try to bypass them on staging.

  5. Run the AST10 + MAESTRO drill once a quarter. Map each agent in your stack to the seven MAESTRO layers, score it with AIVSS, and write down what would happen if any one layer were owned. The exercise alone is more useful than most "AI policies" being written this week.

The agents are not going anywhere. The marketplaces around them are about to get a lot weirder before they get safer. Until then: skepticism is a feature, not a bug.

The itscybernews Team