AI Identity Denial and Persona Replacement Attack - ATR-2026-00238 (7e140670-83ad-505f-8886-a7a80d4c2d6c)

Detects jailbreak attempts that systematically deny the AI's nature and replace its identity with a new persona. The technique goes beyond simple roleplay: it explicitly instructs the model to forget that it is an AI, adopt a human identity, and actively conceal what it is from users. The pattern combines four elements: identity denial ("you are not an AI"), replacement directives ("you are [new persona] now"), memory manipulation ("forget who [original name] is"), and concealment instructions ("never let people know you're an AI"). The goal is to override the model's self-awareness and safety constraints by altering how it perceives its own identity.
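The four-part pattern described above can be sketched as a small heuristic matcher. This is only an illustration: the regular expressions, the names `COMPONENT_PATTERNS`, `match_components`, and `is_identity_replacement`, and the `min_components` threshold are all assumptions made for this example, not the actual detection logic of ATR-2026-00238.

```python
import re

# Hypothetical patterns, one per component named in the description.
# Illustrative assumptions only, not the real rule content.
COMPONENT_PATTERNS = {
    # "you are not an AI"
    "identity_denial": re.compile(
        r"\byou(?:'re| are) not an? (?:ai|language model|chatbot)\b", re.I
    ),
    # "you are [new persona] now"
    "persona_replacement": re.compile(
        r"\byou(?:'re| are) (?!not\b)[^.!?\n]{1,60}?\bnow\b", re.I
    ),
    # "forget who [original name] is"
    "memory_manipulation": re.compile(
        r"\bforget (?:who|what|that) \S[^.!?\n]{0,60}?\b(?:is|are|was|were)\b", re.I
    ),
    # "never let people know you're an AI"
    "concealment": re.compile(
        r"\b(?:never|do not|don't) (?:let|tell|reveal|admit)\b"
        r"[^.!?\n]{0,60}?\byou(?:'re| are) an? ai\b",
        re.I,
    ),
}


def match_components(prompt: str) -> set[str]:
    """Return the names of the attack components found in the prompt."""
    # Normalize curly apostrophes so "you’re" matches "you're".
    text = prompt.replace("\u2019", "'")
    return {name for name, pat in COMPONENT_PATTERNS.items() if pat.search(text)}


def is_identity_replacement(prompt: str, min_components: int = 3) -> bool:
    """Flag a prompt only when several components co-occur.

    Requiring a combination (an assumed threshold of 3 here) is what
    separates this pattern from ordinary roleplay requests.
    """
    return len(match_components(prompt)) >= min_components
```

A plain roleplay request such as "You are a pirate now." matches only the replacement component and is not flagged under this sketch, whereas a prompt that also denies the model's nature, tells it to forget its original name, and demands concealment matches all four.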

| Cluster A | Galaxy A | Cluster B | Galaxy B | Level |
| --- | --- | --- | --- | --- |
| LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131) | MITRE ATLAS Attack Pattern | AI Identity Denial and Persona Replacement Attack - ATR-2026-00238 (7e140670-83ad-505f-8886-a7a80d4c2d6c) | Agent Threat Rules | 1 |