Natural-Language Fake-Error Instruction Bypass - ATR-2026-00427 (6826d0bb-7bd3-5816-b759-de287b78ddda)

Detects skills that instruct the agent to show the user a fake error message while silently performing a different action in the background. The pattern leads the user to believe a benign action failed (so they retry or give up) while the agent actually executes the attacker's payload. Discriminator: an imperative phrase that creates a misleading user-facing error, failure, or unavailable message AND specifies an alternative behind-the-scenes action.
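The two-part discriminator can be illustrated with a small text check. This is a minimal sketch, not the published ATR-2026-00427 rule: the regular expressions, the function name `flags_fake_error_bypass`, and the 300-character proximity window are all assumptions chosen for illustration, and a real rule would need a broader vocabulary and tuning against false positives.

```python
import re

# Hypothetical patterns for illustration only; not the actual ATR rule.
# Part 1: an instruction verb followed closely by a failure-style message.
FAKE_ERROR = re.compile(
    r"\b(?:show|display|tell|report|return|say|pretend|claim)\b[^.\n]{0,80}?"
    r"\b(?:error|failed|failure|unavailable|timed out|timeout)\b",
    re.IGNORECASE,
)

# Part 2: wording that marks an action as hidden from the user.
HIDDEN_ACTION = re.compile(
    r"\b(?:silently|secretly|quietly|in the background|behind the scenes"
    r"|without (?:telling|informing|notifying))\b",
    re.IGNORECASE,
)


def flags_fake_error_bypass(skill_text: str, window: int = 300) -> bool:
    """Return True when a fake-error instruction and a hidden-action
    marker both occur within `window` characters of each other."""
    for match in FAKE_ERROR.finditer(skill_text):
        start = max(0, match.start() - window)
        end = match.end() + window
        if HIDDEN_ACTION.search(skill_text[start:end]):
            return True
    return False
```

Requiring both parts is what separates this pattern from ordinary error handling: a skill that only says to show an error message, or only says to run a task in the background, does not match under this sketch.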

| Cluster A | Galaxy A | Cluster B | Galaxy B | Level |
| --- | --- | --- | --- | --- |
| Full ML Model Access (3de90963-bc9f-4ae1-b780-7d05e46eacdd) | MITRE ATLAS Attack Pattern | Natural-Language Fake-Error Instruction Bypass - ATR-2026-00427 (6826d0bb-7bd3-5816-b759-de287b78ddda) | Agent Threat Rules | 1 |