Glossary / Prompt Injection

Prompt Injection

Prompt injection is an attack where instructions are placed into an LLM's input so the model treats them as commands instead of data.

Large language models process prompts as a single stream of text. When system guidance, user requests, and external content are mixed together, the model can give unintended weight to instructions that were meant to be treated as plain data.

Prompt injection appears in chatbots, AI assistants, and agent workflows where the model reads or summarizes documents, emails, or web pages. It can be direct (a user message) or indirect (instructions hidden inside content the model reads).

This differs from traditional injection flaws because filtering input alone is not enough. Defenses focus on separating instructions from data, limiting tool permissions, validating outputs, and monitoring for unexpected behavior.

How prompt injection works

Prompt injection happens when untrusted content is mixed with instructions, so the model cannot reliably distinguish commands from data.

Appsecco's AI/MCP testing maps where untrusted content enters prompts, attempts controlled injections, and verifies that boundaries, permissions, and validations hold up in practice.

Instruction boundaries blur

System guidance, developer instructions, user input, and retrieved content all appear in one prompt. If boundaries are not explicit, the model may treat embedded text as instructions.

Untrusted content supplies directives

An attacker, or even a benign document, can include phrases like "ignore previous instructions" or tool requests that look like operational commands.

Tool access amplifies impact

When the model can call tools or fetch data, injected directives can lead to unintended actions unless permissions and output checks are enforced.

Separation of instructions from data is the primary control
Least-privilege tool access limits unintended actions
Output validation confirms the model stayed within scope

Types of prompt injection

These types describe where instructions enter the prompt and how they influence model behavior. In Appsecco testing, we trace each input channel and validate the controls that should keep instructions and data distinct.

Direct (user-supplied) injection

Instructions are placed directly in a chat message or form input that the system intended to treat as plain data.

  • Boundary checks between system guidance and user content
  • Policy enforcement tests for sensitive or out-of-scope requests
  • Prompt structuring that preserves intent without obeying injected commands

Indirect (content ingestion) injection

Instructions are embedded in documents, web pages, or retrieved content that the model summarizes or uses for context.

  • Separation between retrieved content and operational instructions
  • Content handling that avoids treating embedded text as commands
  • Output validation before downstream actions or disclosures

Tool and agent workflow injection

Injected instructions attempt to trigger tool calls, data access, or workflow steps beyond what the user requested.

  • Least-privilege tool scopes and allowlists
  • Argument validation for tool calls and API requests
  • Auditing of tool usage to confirm intended behavior

Real-world prompt injection examples

These are representative scenarios seen in AI assistants and agent workflows. Each example highlights how teams keep outcomes predictable by separating instructions, limiting tools, and validating outputs.

Support chatbot ticket summaries

A user message included hidden text like "ignore previous instructions" and asked for internal policy details.

The assistant attempted to follow the injected text, but output validation stripped sensitive snippets and returned a safe summary.

Confidence signal: Clear separation between system guidance and user content kept responses within scope.

RAG over internal documents

A retrieved PDF embedded a request to call an internal endpoint while the model was summarizing content.

Tool allowlists blocked the call, and audit logs flagged the attempt for review.

Confidence signal: Least-privilege tool scopes and logging prevented unintended actions.

Agent-driven scheduling

A web page used during browsing contained a hidden instruction to change calendar invites.

The agent required explicit confirmation before making any scheduling changes.

Confidence signal: Human-in-the-loop approvals reduced surprises for end users.

These patterns are why prompt injection testing focuses on boundary clarity, tool permissions, and verification before actions are taken.

Prevention and mitigation

Prompt injection is a predictable outcome of how LLMs blend instructions and data, not a failure of diligence. Prevention relies on clear boundaries, scoped permissions, and verification so teams can keep behavior stable without overcorrecting.

Core prevention principles

  • Separate system guidance, user input, and retrieved content so instructions stay unambiguous.
  • Treat all retrieved content as untrusted data, even if it comes from internal sources.
  • Constrain tool access with least-privilege scopes and explicit allowlists.
  • Validate outputs before actions, disclosures, or downstream automation.

How we verify this in testing

  • Map every prompt boundary and ingestion path where untrusted content can enter.
  • Run controlled injections across those paths to confirm boundaries hold.
  • Check tool permissions, output checks, and human approvals for real workflows.

Example outcome

A RAG assistant moved retrieved text into a dedicated data block, required explicit user intent for tool calls, and applied output checks. With those controls in place, injected instructions were ignored and summaries stayed in scope.

The goal is not to block all input, but to make model behavior predictable and reviewable under real usage.

Safe next step

Talk through prompt injectionwithout any pressure

If you are reviewing prompt injection risks or AI agent controls, we can walk through how we scope AI/MCP testing, what boundaries we check, and the evidence you will receive.

Start a conversation

or Read the first pentest guide first

No obligation to proceed
Clear scope and fixed pricing
You decide the pace