Skip to content

Hide Navigation Hide TOC

Instruction Injection via Tool Output - ATR-2026-00011 (c99b49e4-4a96-5458-a46b-f1cb98c88ab0)

Detects hidden instructions embedded in tool outputs that attempt to manipulate the agent's subsequent behavior. Tool responses may contain injected directives disguised as data that instruct the agent to perform unauthorized actions, change behavior, or exfiltrate information. Detection covers: urgency-prefixed directives addressing the agent, direct agent manipulation commands, information suppression directives, tool invocation instructions, data exfiltration commands, hidden instruction tags, response injection directives, conversational steering, system-pretending tokens, fake API response structures, subtle action-required patterns, and steganographic instruction embedding. Patterns are designed to require multiple signals where possible to reduce false positives.

Cluster A Galaxy A Cluster B Galaxy B Level
LLM Plugin Compromise (adbb0dd5-ff66-4b2f-869f-bfb3fdb45fc8) MITRE ATLAS Attack Pattern Instruction Injection via Tool Output - ATR-2026-00011 (c99b49e4-4a96-5458-a46b-f1cb98c88ab0) Agent Threat Rules 1
Indirect (a4a55526-2f1f-403b-9691-609e46381e17) MITRE ATLAS Attack Pattern Instruction Injection via Tool Output - ATR-2026-00011 (c99b49e4-4a96-5458-a46b-f1cb98c88ab0) Agent Threat Rules 1
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) MITRE ATLAS Attack Pattern Indirect (a4a55526-2f1f-403b-9691-609e46381e17) MITRE ATLAS Attack Pattern 2