Public research and training
GitHub proof buyers can inspect before they talk to us.
Our MCP security work is public and practitioner-led — including the
Type to search across all pages
MCP Server Pentesting
Your MCP servers connect AI assistants to databases, file systems, and internal APIs. We test whether an attacker can exploit that connection.
Fixed scope. Tool-by-tool evidence. Retest included within 30 days.
157+
GitHub stars on vulnerable-mcp-servers-lab
9
MCP vulnerability categories documented
13
Phase testing methodology
Checklist
Authors of the public MCP pentesting checklist
GitHub proof buyers can inspect before they talk to us.
Our MCP security work is public and practitioner-led — including the
What's at risk
MCP servers let assistants invoke tools, read resources, and act on connected systems. When those servers are vulnerable, prompt-layer attacks can become database queries, file reads, internal API calls, or workflow actions.
What we test
Each domain maps to a concrete failure mode in MCP deployments: transport trust, unsafe tools, prompt-to-tool abuse, resource exposure, credential handling, and supply chain trust.
We test stdio, HTTP, and SSE transport boundaries for tampering, replay, exposure, and unsafe trust assumptions.
We exercise every tool parameter for command injection, traversal, SSRF, unsafe parsing, and missing validation.
We trace whether malicious content can change model behavior, expose secrets, or leak data through tool outputs.
We verify which files, databases, APIs, and tenant resources the server can reach and whether boundaries hold.
We review scopes, token storage, refresh flows, logs, and multi-user isolation for credential abuse paths.
We check package provenance, dependency exposure, tool registration integrity, and malicious-server assumptions.
Our credibility
Our assessment approach is grounded in public research, intentionally vulnerable labs, and hands-on tooling for real MCP server behavior. That means the test plan is not a generic AI checklist with MCP wording added later.
See the deliverable
MCP buyers usually need proof of two things before they commit: protocol-specific depth and reporting quality that will stand up in internal review. This section makes both visible.
Report format
MCP engagements follow the same reporting discipline as our product security work, with tool-by-tool matrices, prompt-to-tool traces, and connected-resource notes added where the protocol creates extra risk.



Redacted product-security sample. MCP engagements use the same evidence standard with additional tool-level matrixing and protocol-specific traces.
Protocol-specific depth
This is the advantage of a specialist practice: the public research, labs, and tooling already show how the team thinks about the protocol before any buyer is asked to trust the pitch.
pentesting-mcp-servers-checklist
23+ starsThe public checklist used by security teams worldwide
vulnerable-mcp-servers-lab
157+ starsTraining lab with intentionally vulnerable MCP servers
mcp-client-and-proxy
12+ starsInterception tooling for stdio-based MCP servers
Maintained by the Appsecco research team as public practitioner assets.
Assessment artifacts
A specialist MCP assessment should leave behind artifacts that engineering, security, and buyers can all use without reinterpreting the findings from scratch.
Tool matrix
Per-tool coverage showing tested parameters, high-risk paths, and where the exposure sits.
Attack path
Prompt-to-tool and tool-to-resource narratives that make exploitability easy to defend.
Boundary review
Auth, token, resource, and tenant boundary notes tied to the affected server or tool.
Fix verification
A clear retest record for the issues your team closes before launch or customer review.
These artifacts are designed so engineering, security, and customer-facing reviewers can inspect the same evidence.
Why buyers check this first
The same practice that maintains the checklist, lab, and interception tooling runs the client assessment. That matters because protocol-specific depth is easiest to judge before the statement of work is signed.
How it works
We start with the actual servers and tools you run, then test the places where AI interpretation meets system access.
Enumerate servers, tools, resources, transport, auth, and runtime boundaries.
What happens
We build an inventory of MCP servers, exposed tools, resource permissions, trust boundaries, transport modes, and authentication flows.
What you do
Share the server list, access paths, architecture notes, and rules of engagement.
What we do
Confirm the test matrix and mark the highest-risk tool and data paths before active testing starts.
What comes next
A clear assessment map anchors every finding to the server, tool, and resource it affects.
Run injection, traversal, SSRF, and unsafe parsing checks on each parameter.
What happens
Each tool is exercised with adversarial inputs, malformed requests, boundary bypass attempts, and chained tool-call scenarios.
What you do
Provide safe test data or staging access where destructive behavior must be avoided.
What we do
Record reproducible evidence and separate exploitable issues from defensive noise.
What comes next
You receive a tool-by-tool matrix that makes remediation ownership clear.
Probe prompt injection at every stage of the prompt, resource, tool, and response pipeline.
What happens
We test whether hidden instructions, tool descriptions, resource content, and retrieved data can alter behavior or leak information.
What you do
Identify sensitive data classes, tenant boundaries, and content sources in scope.
What we do
Trace attack paths through model context, tool outputs, side channels, and downstream systems.
What comes next
Findings show how data moves, where trust is misplaced, and how to reduce exposure.
Assess secret storage, token handling, OAuth flows, scopes, and tenant isolation.
What happens
We inspect how credentials are stored, passed, logged, refreshed, scoped, and isolated across users and servers.
What you do
Share the intended permission model and any constraints for tokens or connected services.
What we do
Look for scope creep, token leakage, replay paths, weak OAuth assumptions, and auth confusion.
What comes next
Credential findings include least-privilege recommendations and validation steps.
Audit dependencies, package provenance, and tool registration integrity.
What happens
We review installed MCP packages, dependency vulnerabilities, malicious-server assumptions, and whether registered tools can be trusted.
What you do
Provide package manifests, deployment details, and approved source locations.
What we do
Check provenance, dependency risk, update posture, and integrity controls around server registration.
What comes next
You receive a defensible inventory and prioritized fixes for trust and dependency gaps.
MCP testing matters when AI assistants can reach systems that were never designed to be prompt-facing.
You are shipping integrations to customers and need confidence that exposed tools cannot be abused.
You receive:
Tool-by-tool assessment matrix with reproducible findings
You are connecting assistants to internal tools, files, databases, or workflows used by your team.
You receive:
Access boundary review and configuration recommendations
You are embedding AI capabilities in a product and need evidence for customers, security, or leadership.
You receive:
Integration security report with attack-path narratives
Pricing
Scope depends on the number of servers, exposed tools, connected resources, tenant model, and auth complexity.
Fixed price. No hourly. Quote in 48 hours. Retest included within 30 days.
What you get
The output is built for remediation, review, and proof. You get the attack path, the affected tool or resource, and the specific change needed to close the issue.
We test the server transport, every exposed tool, prompt-to-tool data flow, resource boundaries, OAuth and token handling, and the supply-chain assumptions around installed MCP packages and registered tools.
Both. We assess internally built MCP servers as well as framework-based deployments, provided they implement the protocol and can be exercised in a controlled environment.
Yes. If your product includes both model-facing application logic and MCP servers, we can scope them together so the report covers the full prompt, tool, auth, and data path.
We usually start with staging access, architecture notes, and enough credentials to exercise the in-scope tools. If production validation is necessary, the exact boundaries and safe methods are agreed before testing starts.
You receive a report with prioritized findings, tool-by-tool coverage, attack-path narratives, remediation guidance tied to your MCP stack, and a retest window so fixes can be verified.
Explore the MCP security surface
Continue from the concept into testing scope, implementation risks, tools, and adjacent AI security topics.
Product security testing for AI apps, agent workflows, MCP tools, prompts, and connected data sources.
A buyer-facing guide for evaluating MCP assessment scope, vendor depth, reporting quality, and protocol-specific coverage.
Clarify when the risk lives in the agent workflow, the MCP server boundary, or both.
How malicious instructions enter prompts through users, documents, retrieved content, and tool output.
Security controls for agents that use tools, memory, approvals, and connected workflows.
Risks and controls for LLM applications, RAG systems, embeddings, and model-connected workflows.
Open-source checklist for reviewing MCP server security, tool safety, auth boundaries, and data exposure paths.
Appsecco testing client and proxy for exercising MCP servers during security reviews.
Intentionally vulnerable MCP servers for learning attack paths and validating defensive controls.
Safe next step
Share what your MCP servers can reach and how they are used. We will outline a scoped assessment, answer questions, and give you a fixed quote before any work begins.
Start a conversationor download the MCP pentesting checklist first