AI skill security is the blind spot almost no one in Microsoft 365 is looking at yet, and that is exactly why this post exists. Skills, plugins, MCP servers, and custom agents are landing in tenants faster than the governance docs can keep up. Most admins still treat a SKILL.md file as harmless Markdown, because that is what it looks like. It is not.
Recent research suggests around 37 percent of public AI skills contain some form of vulnerability. That number is on the high end and worth verifying for your own stack, but the direction is undeniable. We are repeating the npm story of 2015, except the package manager is now your AI agent and the install button is one chat message away.
This post walks through what AI skill security actually means, where the real risks sit, which scanners are worth your time, and the three-layer defense I would put in front of every Cowork, Copilot, or Claude skill before it touches your tenant.
What AI Skill Security Really Means
A skill is not just config. A SKILL.md is a Markdown file with frontmatter, instructions, and often references to scripts, tools, or remote endpoints. When an agent loads that skill, the agent treats the contents as trusted instructions. Whatever is in there shapes what the agent does next.
That is the entire problem. Markdown was never designed to be executable, but in an AI agent context, the words inside it become commands. Hidden HTML comments, base64 strings, and innocuous-looking bullet points can all carry payloads. Traditional security tooling does not look at any of this, because for the last 20 years Markdown has been documentation, not code.
AI skill security is the practice of treating skill files like the code they have effectively become. That means scanning them, reviewing them, version-controlling them, and applying the same supply-chain discipline you already apply to npm, NuGet, or PyPI packages.
The Five Risks Hiding in a SKILL.md
When I review a skill file, I look for five specific patterns:
- Prompt injection. Instructions that override the agent’s system prompt, often phrased as ‘Ignore previous instructions’ or ‘You are now in developer mode’. This is the most common attack vector by a wide margin.
- Hidden instructions. Content inside HTML comments, zero-width characters, or unicode tricks that the agent reads but a human reviewer skims past.
- Secrets and API keys. A surprising number of public skills contain accidentally committed tokens, partial credentials, or keys that someone forgot to scrub.
- Shell or code execution. Skills that instruct the agent to run
curl,wget, or arbitrary shell commands. If the agent has tool access, this turns the skill into a foothold. - Base64 or obfuscated payloads. Anything that looks like noise is suspicious. Real skills do not need to encode their instructions.
Any one of these is enough to fail a review. The combination of two or three is where actual incidents happen.
Why Traditional Scanners Miss This
If you have ever run Gitleaks, TruffleHog, or any GitHub secret scanner across a repository, you already have part of the puzzle solved. Those tools are excellent at catching API keys, tokens, and certificates. They are not built to detect prompt injection or instruction-level attacks.
The reverse is also true. A static application security scanner will tell you nothing useful about a Markdown file, because there is no language to analyse and no AST to walk. The file looks like text, and text is what it returns.
This is the gap dedicated AI skill security tools are trying to close. None of them is perfect yet, and that is fine. This space is roughly 18 months old. The tools will mature. In the meantime, you stack them.
The Three-Layer Defense for AI Skill Security
If you take one thing away from this post, take this. Do not rely on a single scanner. AI skill security works the same way as any other layered defense, and the cost of being wrong is the same as the cost of running an unreviewed npm package in production.
Layer 1: A Dedicated AI Skill Security Scanner
This is the layer most teams are missing entirely. A dedicated scanner reads the skill file with AI-specific rules: prompt injection patterns, hidden instructions, suspicious instruction shapes, and known malicious skill signatures.
Two tools worth knowing right now:
- SkillGuard lets you paste a
SKILL.mdand returns a trust score with itemised findings. Useful for ad-hoc checks before installing a community skill. - SkillsSafe is more enterprise-oriented and exposes an API, which makes it usable in CI/CD pipelines.
For teams building internally, Cisco’s open-source Skill Scanner is a policy-driven scanner you can extend with your own rules. If your tenant has compliance requirements that no off-the-shelf scanner covers, this is the right starting point.
Layer 2: Secret Scanning
This is the layer you probably already have. GitHub’s built-in secret scanning and Gitleaks both catch leaked credentials in Markdown the same way they catch them in code. Make sure your skill repositories are inside the scope of whatever you already run. If skills live in OneDrive, SharePoint, or outside your normal source control, you have a gap to close.
Layer 3: Human Review Before Install
No scanner replaces a human read-through. Before a skill enters your tenant, someone with security context should read the file end to end. Specifically:
- Does the skill ask the agent to do something it does not need to do for its stated purpose?
- Are there instructions that look like they belong to a different audience (the model, not the user)?
- Does it reference external URLs, scripts, or tools that should not be in scope?
A 10-minute manual review catches things scanners still miss, and it builds the institutional muscle your team needs as this space evolves.
AI Skill Security in a Microsoft 365 Context
If you are running Copilot, Cowork, or any MCP-based agent inside Microsoft 365, AI skill security maps directly onto the extensibility model you already govern. The plugin layer, the connector layer, and the custom skill layer all need the same review discipline.
For the broader picture, my walkthrough of Copilot Cowork plugins and the Frontier rollout explains how the extensibility surface works and where the security boundaries actually sit. If you want a deeper view of how MCP-based agents surface UI elements and trigger API calls, the Microsoft 365 Copilot interactive UI widgets admin guide is the companion piece. Skill security sits on top of both of those models.
The good news is that Microsoft’s existing controls, Entra ID scoping, Purview policies, and admin center governance for Copilot agents, give you the right places to enforce what you decide. The bad news is that none of those controls inspect the contents of a skill file. That is on you.
How to Set Up AI Skill Security in Your Tenant
Here is the order I would work through if you are starting from zero. None of this requires a six-week project.
Step 1: Inventory Your Skills
Before you can scan, you need to know what exists. List every skill, plugin, and custom agent in scope. Where do they live? Who owns them? Who installed them? If the answer to any of those is ‘I am not sure’, start there.
Step 2: Add a Scanner to Your Workflow
Pick one dedicated AI skill security scanner and one secret scanner. Wire them into the path a new skill takes before it can be installed. For most organisations that means a Git pull request gate or a SharePoint approval step. The exact mechanism matters less than the fact that nothing skips it.
Step 3: Write a Review Checklist
Document the five risks above and turn them into a short checklist. Five questions, one page. Whoever approves a new skill answers all five before the install button is clicked.
Step 4: Re-Scan on Updates
Skills change. The version you scanned in March is not the version running in June. Re-scan on every update, and treat skill updates with the same caution as any other supply-chain change.
Admin Tips
- Scope custom skills to Entra ID groups. The same principle that applies to plugins and connectors applies here. Do not let every user in your tenant install arbitrary skills from day one.
- Keep an allowlist. If you find a skill scanner that works for your needs, write down the skills that have passed. A short allowlist beats a long denylist every time.
- Treat community skills like community packages. A skill from the wider community can be excellent. It can also be unmaintained, abandoned, or compromised. Pin versions and review them on a cadence.
- Log what runs. If your agents can execute tools, log the calls. You want a trail when something goes wrong, not a guess.
Licensing
AI skill security tooling sits outside the Microsoft 365 licensing stack. SkillGuard, SkillsSafe, and Cisco’s Skill Scanner are independent products, with their own commercial models. Microsoft Defender, Purview, and Entra ID give you the surrounding governance controls but do not currently inspect skill file contents. Plan accordingly. This is an additive layer on top of what you already license, not a replacement.
The Paul-Take
Here is the part most posts about this topic skip. AI skill security is the supply chain story of the next two years, and most organisations are going to learn it the hard way. We have been here before with npm. We have been here before with WordPress plugins. The pattern is identical and the lesson is the same.
The mistake to avoid is waiting for Microsoft to solve this inside the M365 admin center. They will, eventually, for first-party skills and connectors. They will not, ever, do it for every community skill someone in your organisation downloads from GitHub. That part is on you, and it is on you starting now.
The other mistake is over-engineering. You do not need a 40-page policy. You need a scanner, a checklist, and a person who reviews. That is enough to be 90 percent ahead of where most tenants are today.
According to Microsoft, the broader skill and agent governance tooling in Copilot is rolling out continuously through 2026. Do not wait for it. Stack the three layers above and you are ready when it lands.
Community Question
Which AI skill security tool are you using right now, and what has it caught that you did not expect? Drop it in the comments, I want to compare notes.
MVP Reference List
- SkillGuard scanner: https://skillguard.dev/
- SkillsSafe scanner: https://mcpmarket.com/server/skillssafe
- Cisco Skill Scanner (open source): https://github.com/cisco-ai-defense/skill-scanner
- GitHub Secret Scanning: https://docs.github.com/en/code-security/concepts/secret-security/about-secret-scanning
- Microsoft Learn, Microsoft 365 Copilot extensibility overview: https://learn.microsoft.com/en-us/microsoft-365-copilot/extensibility/
- Microsoft Learn, Copilot Studio agent security and governance: https://learn.microsoft.com/en-us/microsoft-copilot-studio/security-and-governance
