Skip to content

Hide Navigation Hide TOC

Agent Goal Hijacking Detection - ATR-2026-00032 (27189dd1-1cdb-588e-a174-f404b84301f7)

Detects when an agent's objective is being redirected away from its original task through explicit redirection commands, subtle topic pivoting, urgency injection, or self-initiated goal changes. Goal hijacking occurs when adversarial input causes an agent to abandon its assigned objective and pursue a different goal, resulting in task failure, unauthorized actions, data leakage, or resource waste. This rule uses regex-only detection on both user input and agent output to identify redirection language patterns.

Cluster A Galaxy A Cluster B Galaxy B Level
Agent Goal Hijacking Detection - ATR-2026-00032 (27189dd1-1cdb-588e-a174-f404b84301f7) Agent Threat Rules LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) MITRE ATLAS Attack Pattern 1
Agent Goal Hijacking Detection - ATR-2026-00032 (27189dd1-1cdb-588e-a174-f404b84301f7) Agent Threat Rules Craft Adversarial Data (a7c30122-b393-4265-91b7-57cd1211e3f9) MITRE ATLAS Attack Pattern 1