Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)

Guardrails are safety controls that are placed between a generative AI model and the output shared with the user to prevent undesired inputs and outputs. Guardrails can take the form of validators such as filters, rule-based logic, or regular expressions, as well as AI-based approaches, such as classifiers and utilizing LLMs, or named entity recognition (NER) to evaluate the safety of the prompt or response. Domain specific methods can be employed to reduce risks in a variety of areas such as etiquette, brand damage, jailbreaking, false information, code exploits, SQL injections, and data leakage.

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
LLM Plugin Compromise (adbb0dd5-ff66-4b2f-869f-bfb3fdb45fc8)	MITRE ATLAS Attack Pattern	Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)	MITRE ATLAS Course of Action	1
Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)	MITRE ATLAS Course of Action	LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131)	MITRE ATLAS Attack Pattern	1
LLM Meta Prompt Extraction (e98acce8-ed69-4ebe-845b-1bcb662836ba)	MITRE ATLAS Attack Pattern	Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)	MITRE ATLAS Course of Action	1
Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)	MITRE ATLAS Course of Action	LLM Data Leakage (45d378aa-20ae-401d-bf61-7f00104eeaca)	MITRE ATLAS Attack Pattern	1
ML Supply Chain Compromise (d2cf31e0-a550-4fe0-8fdb-8941b3ac00d9)	MITRE ATLAS Attack Pattern	Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)	MITRE ATLAS Course of Action	1
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)	MITRE ATLAS Course of Action	1