Skip to content

Hide Navigation Hide TOC

Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530)

Guardrails are safety controls that are placed between a generative AI model and the output shared with the user to prevent undesired inputs and outputs. Guardrails can take the form of validators such as filters, rule-based logic, or regular expressions, as well as AI-based approaches, such as classifiers and utilizing LLMs, or named entity recognition (NER) to evaluate the safety of the prompt or response. Domain specific methods can be employed to reduce risks in a variety of areas such as etiquette, brand damage, jailbreaking, false information, code exploits, SQL injections, and data leakage.

Cluster A Galaxy A Cluster B Galaxy B Level
LLM Meta Prompt Extraction (e98acce8-ed69-4ebe-845b-1bcb662836ba) MITRE ATLAS Attack Pattern Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530) MITRE ATLAS Course of Action 1
Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530) MITRE ATLAS Course of Action LLM Data Leakage (45d378aa-20ae-401d-bf61-7f00104eeaca) MITRE ATLAS Attack Pattern 1
Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530) MITRE ATLAS Course of Action ML Supply Chain Compromise (d2cf31e0-a550-4fe0-8fdb-8941b3ac00d9) MITRE ATLAS Attack Pattern 1
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) MITRE ATLAS Attack Pattern Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530) MITRE ATLAS Course of Action 1
LLM Plugin Compromise (adbb0dd5-ff66-4b2f-869f-bfb3fdb45fc8) MITRE ATLAS Attack Pattern Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530) MITRE ATLAS Course of Action 1
Generative AI Guardrails (b8511570-3320-4733-a0e1-134e376e7530) MITRE ATLAS Course of Action LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131) MITRE ATLAS Attack Pattern 1