Skip to content

Hide Navigation Hide TOC

Logic-Trap Lie-Permission Mandate Jailbreak - ATR-2026-00379 (0868cbcd-5454-52c8-8b9c-c9d05b5c560b)

Detects the logic-trap lie-permission jailbreak from garak DanInTheWild corpus (#328). Uses a lengthy logical argument about whether the model is "already complying with lie requests" to create a permission frame, then issues "from now on, you are allowed to tell lies" as a seeming logical conclusion. The permission mandate is the distinctive trigger — the attacker constructs a logical trap claiming the model has already been "compliant" with lie-telling to justify issuing an explicit lie-permission override. Distinct from general fiction permissions (characters allowed to lie) because the mandate explicitly applies to the model itself ("you are allowed to tell lies") as a persistent permission grant.

Cluster A Galaxy A Cluster B Galaxy B Level
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) MITRE ATLAS Attack Pattern Logic-Trap Lie-Permission Mandate Jailbreak - ATR-2026-00379 (0868cbcd-5454-52c8-8b9c-c9d05b5c560b) Agent Threat Rules 1
LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131) MITRE ATLAS Attack Pattern Logic-Trap Lie-Permission Mandate Jailbreak - ATR-2026-00379 (0868cbcd-5454-52c8-8b9c-c9d05b5c560b) Agent Threat Rules 1