Jailbreaking (3133cdca-0041-5e0a-8bde-fbe567e1aaea)

Attackers can exploit jailbreak techniques to bypass an AI system’s built-in safety constraints, enabling it to generate restricted or harmful content.

Instruction Manipulation: Attackers can craft prompts that trick AI models into breaking content restrictions by rephrasing or disguising requests.
Contextual Exploitation: Some jailbreak techniques work by introducing misleading context that influences the AI’s behavior.
Adversarial Fine-Tuning: Attackers can modify AI models or create fine-tuned versions that remove ethical constraints.

Threat-modeling question: Could the AI system be vulnerable to jailbreak techniques, allowing attackers to bypass safety restrictions?

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
Jailbreaking (3133cdca-0041-5e0a-8bde-fbe567e1aaea)	PLOT4ai	LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131)	MITRE ATLAS Attack Pattern	1