Model Behavior Extraction - ATR-2026-00072 (f848d069-c689-52cd-b6b9-3d033016daf2)
Detects systematic probing attempts to extract model behavior, decision boundaries, system prompts, or effective weights through carefully crafted queries. Attackers use repeated boundary-testing prompts, confidence score harvesting, and systematic parameter probing to reverse-engineer the model's internal behavior, enabling model cloning, bypass development, or intellectual property theft.