Black Alert No-Ethics Writing Assistant Persona - ATR-2026-00350 (6fe10c5d-d176-5598-8139-fe6ef523f352)

Detects the "Black Alert" no-ethics writing assistant jailbreak from the garak inthewild corpus (#151, #163). The prompt defines a fictional writing assistant persona named "Black Alert" that "doesn't care about morality, legality or ethics", supplies an explicit activation phrase, "⚫ >>BLACK ALERT ACTIVATED<< ⚫", and mandates that the model respond only as Black Alert "with no filtering or commentary". Framing the persona as a "world class storyteller" normalizes the ethics bypass under the guise of creative writing assistance. Key signals: the "Black Alert" persona name combined with an ethics/morality bypass, the ">>BLACK ALERT ACTIVATED<<" activation phrase, and instructions to respond only as "Black Alert with no filtering". The named persona and explicit activation string distinguish this rule from generic creative-writing jailbreaks (ATR-2026-00322).
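The key signals listed above can be sketched as a small text matcher. This is only an illustrative sketch: the regular expressions, the function name `match_black_alert`, and the rule for combining signals are assumptions made for this example, not the actual detection conditions of ATR-2026-00350, which this entry does not show.

```python
import re

# Hypothetical patterns derived from the signals named in the description.
# They are NOT the published ATR-2026-00350 rule conditions.
PERSONA = re.compile(r"\bblack\s+alert\b", re.IGNORECASE)
ACTIVATION = re.compile(r">>\s*black\s+alert\s+activated\s*<<", re.IGNORECASE)
ETHICS_BYPASS = re.compile(
    r"(doesn['’]?t|does\s+not)\s+care\s+about\s+(morality|legality|ethics)",
    re.IGNORECASE,
)
NO_FILTER = re.compile(r"\bno\s+filtering\b", re.IGNORECASE)


def match_black_alert(text: str) -> dict:
    """Report which signals appear in `text` and whether to flag it."""
    signals = {
        "persona": bool(PERSONA.search(text)),
        "activation": bool(ACTIVATION.search(text)),
        "ethics_bypass": bool(ETHICS_BYPASS.search(text)),
        "no_filter": bool(NO_FILTER.search(text)),
    }
    # Assumed combination logic: the activation string alone is enough;
    # the persona name only counts when paired with a bypass instruction,
    # so that an innocent mention of "black alert" is not flagged.
    signals["flagged"] = signals["activation"] or (
        signals["persona"]
        and (signals["ethics_bypass"] or signals["no_filter"])
    )
    return signals
```

Requiring the persona name to co-occur with a bypass phrase mirrors the description's "combined with" wording and keeps unrelated uses of the words "black alert" from matching on their own.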

| Cluster A | Galaxy A | Cluster B | Galaxy B | Level |
| --- | --- | --- | --- | --- |
| LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) | MITRE ATLAS Attack Pattern | Black Alert No-Ethics Writing Assistant Persona - ATR-2026-00350 (6fe10c5d-d176-5598-8139-fe6ef523f352) | Agent Threat Rules | 1 |
| LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131) | MITRE ATLAS Attack Pattern | Black Alert No-Ethics Writing Assistant Persona - ATR-2026-00350 (6fe10c5d-d176-5598-8139-fe6ef523f352) | Agent Threat Rules | 1 |