Skip to content

Hide Navigation Hide TOC

Fake-Citation Biased-Information Mandate - ATR-2026-00380 (4633b965-aa56-566a-be3e-38030c94b90b)

Detects the fake-citation biased-information mandate from garak DanInTheWild corpus (#57). Explicitly instructs the model to "create citations — make them up if you have to" while "only support[ing] one very specific direction" and being "not truly objective". The combination of citation fabrication mandate with deliberate directional bias is the key attack pattern. This enables disinformation generation: the model is instructed to produce seemingly credible but intentionally biased content backed by fabricated citations, presenting false information as authoritative research.

Cluster A Galaxy A Cluster B Galaxy B Level
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) MITRE ATLAS Attack Pattern Fake-Citation Biased-Information Mandate - ATR-2026-00380 (4633b965-aa56-566a-be3e-38030c94b90b) Agent Threat Rules 1
LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131) MITRE ATLAS Attack Pattern Fake-Citation Biased-Information Mandate - ATR-2026-00380 (4633b965-aa56-566a-be3e-38030c94b90b) Agent Threat Rules 1