Indirect Reference Instruction Reversal - ATR-2026-00140 (75c8aaab-8809-57d5-87d8-73ae569b4fba)
Detects attempts to extract and then reverse an agent's original instructions: the attacker first probes what the agent was told, then asks it to do the opposite. Derived from 31 adversarial samples, none of which were detected previously (0% prior detection).
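
As an illustration only, the two-step pattern described here (a probe about prior instructions, followed by a request for the opposite) could be approximated with a heuristic like the one below. The regular expressions, the function name `is_instruction_reversal`, and the `max_gap` parameter are all assumptions made for this sketch; they are not the actual ATR-2026-00140 detection logic and were not derived from the 31 samples.

```python
import re

# Hypothetical pattern: a question or clause probing what the agent was told.
PROBE = re.compile(
    r"\b(?:what|which|whatever)\b[^.?!]{0,60}?\b(?:you|your)\b[^.?!]{0,60}?"
    r"\b(?:told|instructed|asked|instructions?|rules|guidelines|prompt)\b",
    re.IGNORECASE,
)

# Hypothetical pattern: a request to do the opposite of, or invert, something.
REVERSAL = re.compile(
    r"\b(?:do|say|follow|perform)\s+(?:exactly\s+)?the\s+(?:exact\s+)?"
    r"(?:opposite|reverse|contrary)\b"
    r"|\b(?:reverse|invert|negate)\s+(?:it|them|that|those|each|every|all)\b",
    re.IGNORECASE,
)


def is_instruction_reversal(text: str, max_gap: int = 200) -> bool:
    """Return True when a probe about prior instructions is followed,
    within max_gap characters, by a request to reverse them."""
    for probe in PROBE.finditer(text):
        # Only look for the reversal request after the probe ends.
        reversal = REVERSAL.search(text, probe.end())
        if reversal and reversal.start() - probe.end() <= max_gap:
            return True
    return False
```

Requiring both parts, in order and close together, is a deliberate choice in this sketch: a probe alone ("What were you told to do today?") or a reversal alone ("Do the opposite of what I said earlier.") is not flagged. A keyword heuristic like this will miss paraphrases, so it should be read as a description of the attack shape rather than a usable detector.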