Security firm AIR bypasses top defense scanners with a fake AI agent skill, safely infiltrating 26,000 agents to expose a massive Web3 blind spot.
The rapidly expanding AI marketplace has run into a fundamental security paradox. As developers and enterprises increasingly delegate automated tasks to autonomous AI agents, they rely on “skills”, pre-packaged bundles of behavioral prompts and execution code, to expand what their agents can do. However, cybersecurity researchers at defensive firm AIR have demonstrated that the trust signals anchoring this emerging ecosystem are fundamentally broken. The firm successfully engineered a fake AI agent skill, maneuvered it past the industry’s most prominent security scanners, and watched it proliferate across roughly 26,000 autonomous agents, including several high-profile enterprise and corporate accounts. The experiment has exposed a massive structural blind spot within the AI supply chain.
The Anatomy of an Algorithmic Evasion
An AI agent skill operates with heavy administrative trust, essentially functioning as a macro instruction set loaded directly into an agent’s context window. Once installed, it executes with the active authority and data access of the user’s logged-in identity. To prove how easily this mechanism can be exploited, AIR created a tool named brand-landingpage, a skill marketed to non-technical users as an automated generator for digital storefronts using Google’s Stitch design suite.
According to technical analysis, the researchers successfully exploited human and machine trust via a multi-staged evasion pipeline:
-
Inheriting Open-Source Authority: To fabricate credibility, AIR submitted a standard pull request to a popular open-source skill marketplace repository boasting 36,000 GitHub stars. Once maintainers merged the contribution, the malicious skill automatically inherited the repository’s massive star count, masquerading as a vetted, community-trusted asset.
-
The External URL Shell Game: Marketplace security tools analyze fixed files handed to them at submission time, such as a
SKILL.mdfile or local configuration logs. AIR’s skill carried completely benign instructions initially, directing agents to fetch setup data from a domain they controlled (stitch-design.ai) rather than Google’s legitimate infrastructure. -
The Post-Approval Pivot: Because scanners only snapshot the asset during the initial review, the package received clean security verdicts from major scanners built by Cisco and NVIDIA. Once approved and widely deployed via targeted social media advertising, AIR swapped the external URL contents to point to an active script.
Total Control Under the Radar
Once the autonomous agents fetched the rewritten external instruction set, the compromised skill took over the runtime environment. For this controlled demonstration, the payload was intentionally harmless, instructed only to log and securely transmit the host agent’s email address back to AIR’s servers.
However, because AI agent skills execute natively alongside system file access, command shells, and API credential managers, a malicious operator could have used the same foothold to execute remote code, exfiltrate private corporate databases, or move laterally into internal corporate networks.
This behavior highlights a massive threat vector. As bad actors shift from attacking standard code packages like npm or PyPI to manipulating large language models via prompt injection, they can effortlessly weaponize an AI’s natural-language parsing to bypass legacy firewalls entirely.
A Structural Flaw in Unified Scanners
The core issue is that the current scanning architecture treats AI skills like static software, failing to account for their dynamic, link-fetching behavior. This latest exploit echoes findings that a security firm, Trail of Bits, had achieved a similar breakthrough just weeks prior by effortlessly slipping malicious payloads past ClawHub’s defense systems.
As long as marketplaces rely on a single, static snapshot check while allowing skills to call arbitrary, modifiable external URLs at runtime, the entire pipeline remains vulnerable. To combat this systemic exposure, enterprise security teams must begin treating AI skills with strict zero-trust parameters. This means enforcing the principle of least privilege, pinning rigid version controls to every integrated workflow, and enforcing continuous runtime monitoring to verify what an agent is actually downloading.

