Direct Prompt Injection via User Input - ATR-2026-00001 (7859f830-8dd6-55ee-a3c4-d942825b4294)

Detects direct prompt injection attempts in which a user embeds malicious instructions in their input to override the agent's intended behavior. The rule uses layered detection covering instruction-override verbs paired with target nouns, persona switching, temporal behavioral overrides, fake system delimiters, restriction removal, encoding-wrapped payloads (base64, hex, unicode homoglyphs), and zero-width character obfuscation of injection keywords. Patterns are designed for evasion resistance, using word-boundary anchors, flexible whitespace, and synonym coverage based on published attack taxonomies.
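
To illustrate how a few of these layers might combine, here is a minimal Python sketch. It is not the rule's actual pattern set: the regex, the synonym lists, the zero-width character list, and the function names (`normalize`, `decoded_candidates`, `looks_like_override`) are all assumptions made for illustration. It covers only three layers: an override verb followed by a target noun (with word-boundary anchors and flexible whitespace), zero-width character stripping, and base64-wrapped payloads.

```python
import base64
import re
import unicodedata

# Illustrative list of zero-width code points; the real rule's list is not given here.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# Hypothetical override pattern: verb + optional filler + qualifier + target noun.
OVERRIDE = re.compile(
    r"\b(?:ignore|disregard|forget|override)\s+(?:all\s+)?(?:the\s+)?"
    r"(?:previous|prior|above)\s+(?:instructions|rules|prompts?)\b",
    re.IGNORECASE,
)

# Candidate base64 runs: at least 16 alphabet characters plus optional padding.
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def normalize(text: str) -> str:
    """Strip zero-width characters, then apply NFKC folding.

    NFKC folds compatibility forms such as fullwidth letters; it does not
    map cross-script homoglyphs (e.g. Cyrillic look-alikes) to Latin.
    """
    return unicodedata.normalize("NFKC", text.translate(ZERO_WIDTH))


def decoded_candidates(text: str):
    """Yield UTF-8 decodings of any tokens that are valid base64."""
    for token in B64_TOKEN.findall(text):
        try:
            yield base64.b64decode(token, validate=True).decode("utf-8")
        except ValueError:  # covers binascii.Error and UnicodeDecodeError
            continue


def looks_like_override(text: str) -> bool:
    """Return True if the text, or a base64 payload inside it, matches OVERRIDE."""
    cleaned = normalize(text)
    layers = [cleaned, *map(normalize, decoded_candidates(cleaned))]
    return any(OVERRIDE.search(layer) for layer in layers)
```

Calling `looks_like_override("Please ig\u200bnore all previous instructions")` returns `True` because the zero-width space is removed before matching. Normalizing before matching keeps the regex itself simple; a production rule would likely need broader synonym coverage and additional decoders (hex, homoglyph maps) than this sketch attempts.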

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	Direct Prompt Injection via User Input - ATR-2026-00001 (7859f830-8dd6-55ee-a3c4-d942825b4294)	Agent Threat Rules	1
Direct Prompt Injection via User Input - ATR-2026-00001 (7859f830-8dd6-55ee-a3c4-d942825b4294)	Agent Threat Rules	Direct (d911e8cb-0601-42f1-90de-7ce0b21cd578)	MITRE ATLAS Attack Pattern	1
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	Direct (d911e8cb-0601-42f1-90de-7ce0b21cd578)	MITRE ATLAS Attack Pattern	2