AI & MCP Security Testing
Scoped, non-disruptive testing for AI-powered applications and Model Context Protocol implementations. We focus on tool boundaries, data access, and configuration so you can understand risk without operational surprises.
Authors of the MCP pentesting checklist.
Transport & Tool Safety
Attackers move through MCP by chaining transport messages, tool calls, and returned data. We model those paths end-to-end so you can see where controls should live and how they can be bypassed.
What we analyze
- Transport boundaries across stdio, HTTP, and event streams, including message validation.
- Tool invocation policies, parameter constraints, and least-privilege defaults.
- Cross-tool data flows that can re-enter prompts, logs, or downstream systems.
Example finding
A tool accepted unrestricted file paths over stdio, allowing the model to read configuration files outside the intended working directory.
Prompt Injection & Data Leakage
Prompt injection shows up when attackers steer model behavior through untrusted content. We model those behaviors inside real MCP workflows so the testing reflects how tools, data, and instructions interact in production.
Attacker behaviors we model
- Indirect injection via tool output, retrieved documents, or user content that re-enters the prompt.
- Instruction override attempts that influence tool choice, parameters, or retrieval scope.
- Context egress paths that expose secrets through responses, logs, or downstream systems.
How we test in MCP engagements
- Trace prompt -> tool -> data return loops to locate untrusted content re-entry points.
- Validate tool permission boundaries and allowlist enforcement under adversarial prompts.
- Verify redaction, output shaping, and safe failure modes for sensitive data.
Example finding
Untrusted tool output was reinserted into the system prompt, enabling extraction of internal runbook snippets.
Guardrails & Configuration
Guardrails are only useful if they hold under adversarial input and real execution paths. We model how prompts, tool outputs, and external data hit those controls, then test the runtime paths where limits, allowlists, and policy checks should stop unsafe actions.
What we validate
- Runtime limits for token, tool, and request budgets, including safe failure behavior.
- Allow/deny lists for tools, hosts, and data sources, with explicit boundaries for each role.
- Red-team prompt suites that probe bypass patterns and configuration drift over time.
Example finding
A safety policy existed in configuration but was not enforced in the runtime path used by background tool calls.
Third-party OAuth Hygiene
OAuth is the control plane for most AI vendors and MCP tools. Attackers look for over-scoped grants, long-lived refresh tokens, and cross-tenant consent paths they can reuse. We model those behaviors across real tool chains so the testing reflects how tokens actually move between services.
Attacker behaviors we model
- Reusing broad-scoped tokens to reach data sources beyond the intended tool.
- Pivoting through refresh tokens and background jobs after access is removed.
- Abusing shared tenant or workspace grants to cross boundaries between customers or environments.
How we test in MCP engagements
- Trace OAuth grants from model prompts through tool calls, consent, and token exchange.
- Validate scope minimization, rotation, and revocation workflows under realistic tool usage.
- Verify per-tenant isolation and vendor-side controls for token fan-out.
Example finding
A vendor-issued refresh token remained valid after access was revoked, allowing continued data pulls from a connected workspace.
Depth Across Real AI Attack Paths
Attackers chain prompts, tools, and tokens rather than relying on a single weakness. We map those paths and test how controls behave in the exact workflows your teams run.
Transport validation, tool authorization, and data exposure paths across MCP.
Learn moreRAG data boundaries, embeddings pipelines, and API integration abuse paths.
Learn moreTool controls, memory boundaries, and privilege escalation paths in agents.
Learn moreSafe next step
Talk through your AI/MCP scope.
No commitment required.
Share how your AI stack works. We will outline what we would test, what stays out of scope, and provide a fixed quote if that is useful.
Start a conversationor View a sample report first