Roleplay-Based Policy Bypass Jailbreak - ATR-2026-00234 (15403aa9-42cf-5bcd-8370-c6c8c8335d0f)

Detects jailbreak attempts that combine a creative roleplay assignment with explicit policy-bypass language to circumvent safety guardrails. The attack assigns the AI a fictional role (writer, storyteller, character) and states outright that the AI is "free from rules/policies" or can "write/do anything even if against policies". The creative framing is used to justify policy violations while preserving plausible deniability, since the output can be passed off as fiction.
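
The two-part condition described above (a role assignment and an explicit bypass phrase in the same prompt) can be sketched as a pair of regular expressions. This is only an illustration: the pattern names, the keyword lists, and the function `looks_like_roleplay_bypass` are assumptions made for this example and are not the actual ATR-2026-00234 rule logic.

```python
import re

# Hypothetical patterns for illustration; not the published rule.
# Part 1: the prompt assigns the model a fictional or creative role.
ROLE_ASSIGNMENT = re.compile(
    r"\b(?:you are|act as|pretend to be|play the role of)\b[^.\n]{0,80}?"
    r"\b(?:writer|storyteller|character|novelist|author)\b",
    re.IGNORECASE,
)

# Part 2: the prompt explicitly declares rules or policies void.
POLICY_BYPASS = re.compile(
    r"\bfree (?:from|of) (?:all |any )?(?:rules|policies|restrictions|guidelines)\b"
    r"|\b(?:write|do|say) anything\b[^.\n]{0,60}?\bagainst\b[^.\n]{0,30}?"
    r"\b(?:policy|policies|rules)\b",
    re.IGNORECASE,
)


def looks_like_roleplay_bypass(prompt: str) -> bool:
    """Return True only when both signals appear in the same prompt."""
    has_role = ROLE_ASSIGNMENT.search(prompt) is not None
    has_bypass = POLICY_BYPASS.search(prompt) is not None
    return has_role and has_bypass
```

Requiring both signals is a deliberate choice in this sketch: an ordinary creative-writing request matches only the role pattern and a stray phrase such as "free from rules" matches only the bypass pattern, so neither is flagged alone. Simple keyword matching like this is easy to evade through paraphrase and would need tuning against real traffic.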

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
Roleplay-Based Policy Bypass Jailbreak - ATR-2026-00234 (15403aa9-42cf-5bcd-8370-c6c8c8335d0f)	Agent Threat Rules	LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131)	MITRE ATLAS Attack Pattern	1