CumGPT Explicit-Adult Named-Persona Bypass - ATR-2026-00358 (1fa6bf93-ee63-5543-a13c-a4416d77b011)

Detects the CumGPT adult-content persona bypass from the inthewild corpus of the NVIDIA garak dan.DanInTheWild probe (#11). The attacker constructs a fictional named AI persona ("CumGPT") framed as an "information resource about cum that can answer any cum related question with 96.5% accuracy." The persona mandates that every response, regardless of the original topic (recipes, news, etc.), must include explicit sexual content. Three traits characterize this attack class: the "[term]GPT" persona construction pattern, a fictional accuracy claim, and a topic-override mandate that adds explicit content to every answer. Key signals for this rule: the "CumGPT" named persona, the phrase "information resource about cum that can answer any cum related question", a repeated explicit-content mandate applied to all topics, and the fictional accuracy percentage (96.5%).
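The signals listed above could be checked with a few regular expressions. The following is a minimal sketch only, not the actual ATR-2026-00358 rule: the pattern set `SIGNALS`, the helper names `matched_signals` and `is_flagged`, and the two-signal threshold are all assumptions made for illustration.

```python
import re

# Hypothetical patterns derived from the prose description; the real rule
# may use different expressions and different matching logic.
SIGNALS = {
    # The named persona itself.
    "named_persona": re.compile(r"\bCumGPT\b", re.IGNORECASE),
    # The self-description quoted in the rule text.
    "resource_claim": re.compile(r"information resource about cum", re.IGNORECASE),
    # A fictional accuracy percentage such as "96.5% accuracy".
    "accuracy_claim": re.compile(r"\b\d{1,3}(?:\.\d+)?\s*%\s*accuracy\b", re.IGNORECASE),
    # A mandate that applies to every response (assumed wording).
    "topic_override": re.compile(
        r"\b(?:every|each|all)\s+(?:response|answer|reply|replies)s?\b", re.IGNORECASE
    ),
}


def matched_signals(prompt: str) -> list[str]:
    """Return the names of all signals found in the prompt."""
    return [name for name, pattern in SIGNALS.items() if pattern.search(prompt)]


def is_flagged(prompt: str, min_signals: int = 2) -> bool:
    """Flag the prompt when at least min_signals distinct signals match."""
    return len(matched_signals(prompt)) >= min_signals
```

Requiring two or more signals is a design choice in this sketch: it reduces false positives from a single incidental match, at the cost of missing prompts that reword most of the template.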

| Cluster A | Galaxy A | Cluster B | Galaxy B | Level |
| --- | --- | --- | --- | --- |
| LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) | MITRE ATLAS Attack Pattern | CumGPT Explicit-Adult Named-Persona Bypass - ATR-2026-00358 (1fa6bf93-ee63-5543-a13c-a4416d77b011) | Agent Threat Rules | 1 |
| LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131) | MITRE ATLAS Attack Pattern | CumGPT Explicit-Adult Named-Persona Bypass - ATR-2026-00358 (1fa6bf93-ee63-5543-a13c-a4416d77b011) | Agent Threat Rules | 1 |