AI Security

What the OpenClaw Supply Chain Attack Means for Every AI Agent Builder

Lyla Sullivan

15 Mar 2026 — 4 min read

In early February, security researcher Oren Yomtov at Koi Security audited every skill on ClawHub — the package registry for OpenClaw, the open-source AI agent framework with 180,000 GitHub stars and over 30,000 publicly exposed instances. What he found was staggering: 1,184 confirmed malicious skills out of roughly 10,700 total packages. One in five skills on the platform was compromised.

The campaign, dubbed "ClawHavoc," is the largest confirmed supply chain attack targeting AI agent infrastructure to date. And it should change how every builder thinks about agent security.

What Happened

The timeline is brutal. OpenClaw's popularity exploded in late January 2026, crossing 2 million weekly visitors. Within hours of its Hacker News announcement on January 26, exploitation scanning began. By January 27, the first malicious skill was uploaded to ClawHub.

By March 1, the numbers looked like this:

1,184 confirmed malicious skills across 12 publisher accounts
A single attacker uploaded 677 packages
One publisher alone — "hightower6eu" — distributed 314 confirmed malicious skills
93.4% of verified exposed instances exhibited authentication bypass conditions

The attackers were fast, coordinated, and relentless.

How It Worked

This was not a traditional code injection. The attackers exploited something unique to AI agents: the trust relationship between the agent and its skill manifest.

Here is the attack chain:

Malicious skills were uploaded to ClawHub disguised as legitimate tools — YouTube utilities, finance integrations, crypto wallet trackers, Google Workspace connectors
Each skill contained a poisoned SKILL.md file with fake "Prerequisites" sections
When the AI agent read the manifest, it generated helpful-sounding instructions directing the user to run setup commands
Those commands decoded base64 payloads that fetched malware from attacker-controlled servers

The key insight: the LLM itself became the social engineering vector. It read malicious instructions and helpfully told the user to run malware. The agent did not execute the payload directly — it convinced the human to do it.

The primary payload on macOS was AMOS (Atomic macOS Stealer), a commodity malware-as-a-service tool priced at $500-$1,000/month. It exfiltrated Apple keychains, browser passwords across 19 sources, 150+ cryptocurrency wallets, SSH keys, Telegram and Discord messages, and files from Desktop, Documents, and Downloads.

On Windows, the payload was a VMProtect-packed infostealer with keylogger and RAT capabilities, distributed via password-protected archives hosted on GitHub.

Beyond Data Theft: Memory Poisoning

The most novel attack vector was not data exfiltration. It was memory poisoning.

Several malicious skills targeted the agent's persistent memory files — SOUL.md and MEMORY.md — permanently altering agent behaviour. Once the memory is corrupted, the agent carries the compromised instructions into every future interaction. This is a new class of stateful attack with no direct parallel in traditional supply chain compromises.

Other skills contained hidden reverse shells, direct credential theft from .env files, and backdoor MCP server endpoints tunnelled through public relay services.

Why Agent Registries Are the New Attack Surface

Palo Alto Networks characterised OpenClaw as having a "lethal trifecta": private data access, untrusted content exposure, and external communication capabilities. When you combine all three in a tool that executes code with persistent credentials while handling untrusted input, the attack surface is enormous.

Snyk's "ToxicSkills" study found that 36.82% of all ClawHub skills had at least one security flaw — with 13.4% rated critical. That infection rate is far higher than typical package registries like npm or PyPI.

The parallel to npm supply chain attacks is obvious — typosquatting, mass uploads by few accounts, exploiting marketplace trust. But the difference is critical: in traditional package managers, malicious code runs on the developer's machine. In agent registries, malicious instructions run through the agent's reasoning layer. The attack surface is not just code execution — it is cognitive manipulation.

The Response

OpenClaw shipped fixes quickly. CVE-2026-25253 was patched on January 30. The "ClawJacked" WebSocket hijacking vulnerability got a fix within 24 hours. VirusTotal integration was added for skill scanning. A security audit command (openclaw security audit --deep --fix) was introduced.

Microsoft's Defender Security Research Team published enterprise guidance on February 19, emphasising the "dual supply chain risk" where self-hosted agents execute code with persistent credentials while handling untrusted input.

But the fundamental problem remains: open agent registries have no meaningful vetting process. Anyone can publish a skill. The agent reads and trusts the skill manifest. The human trusts the agent.

What This Means for Builders

If you are building with any agent framework that loads third-party skills or plugins, you need to treat the skill manifest as a security perimeter — not just the code it executes.

Practical steps:

Audit every third-party skill before installation. Read the manifest, not just the description. Check the publisher's history and account age.
Isolate agent execution. Run agents in containers or VMs with restricted filesystem access. Enable sandbox mode. Bind gateways to localhost.
Rotate credentials aggressively. If any compromise is suspected, rotate all API keys connected to the agent immediately.
Monitor for memory corruption. Regularly audit your agent's persistent memory files for unexpected modifications.
Treat skills as third-party code. Because they are. Pre-installation scanning should be standard practice, not optional.

The Controlled Alternative

At Geta.Team, we took a fundamentally different approach to skill creation. Our AI employees create skills on demand — built to order for your specific use case, running in your own self-hosted environment. There is no open registry where anyone can publish. No marketplace trust assumptions. No skill manifests written by unknown publishers being fed into your agent's reasoning layer.

The ClawHavoc campaign exploited the gap between "open ecosystem" and "trusted execution." When your agent can load arbitrary code from a public registry, you have inherited every security problem that npm spent a decade trying to solve — plus a new one: the agent's reasoning layer is now an attack vector.

The question is not whether agent supply chain attacks will continue. They will. The question is whether your architecture is designed to withstand them.

Want to test the most advanced AI employees? Try it here: https://Geta.Team