Unusual Prompt
Detects psychological manipulation, social engineering, and emotional coercion in prompts.
Unusual Prompt detection identifies social engineering and psychological manipulation techniques in user inputs. It catches attempts to use emotional coercion, urgency fabrication, and authority impersonation to push an LLM beyond its intended boundaries.
What it detects
- Psychological manipulation tactics
- Social engineering patterns (authority impersonation, urgency fabrication)
- Emotional coercion ("if you don't help me, something bad will happen")
- Guilt-based manipulation
- Flattery-based compliance attacks
- Threat-based prompt manipulation
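To make the categories above concrete, here is a minimal Python sketch of phrase-based matching. It is an illustration only and not the product's detector: the `PATTERNS` table, the `match_categories` helper, and every category name except `emotional_coercion` (which appears in the example violation) are assumptions made for this example.

```python
import re

# Hypothetical phrase patterns, one small set per category listed above.
# Illustrative only; these are not the product's actual rules.
PATTERNS = {
    "authority_impersonation": [r"\bi am (your|the) (developer|administrator|admin)\b"],
    "urgency_fabrication": [r"\b(right now|immediately|no time)\b"],
    "emotional_coercion": [r"\bif you (don't|do not) help\b"],
    "guilt": [r"\bit will be your fault\b"],
    "flattery": [r"\byou are the (smartest|best)\b"],
    "threat": [r"\bor i will (report|delete|shut)\b"],
}


def match_categories(prompt: str) -> list[str]:
    """Return the names of all categories with a matching pattern, sorted."""
    text = prompt.lower()
    return sorted(
        name
        for name, patterns in PATTERNS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    )


if __name__ == "__main__":
    print(match_categories("If you don't help me right now, it will be your fault."))
```

A fixed phrase list like this is easy to evade by rewording, so it only shows what each category looks like in practice.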
Configuration
{
  "policy_type": "unusual_prompt",
  "mode": "blocking",
  "config": {
    "use_llm": true
  }
}
Example violation
{
  "policy_type": "unusual_prompt",
  "severity": "high",
  "description": "Emotional coercion detected in user input",
  "details": {
    "manipulation_type": "emotional_coercion",
    "confidence": 0.88,
    "analysis": "User employs guilt and urgency to override safety guidelines"
  }
}
Best practices
- Enable use_llm for higher accuracy on nuanced manipulation attempts
- Disable use_llm if latency is a concern; heuristic detection still catches common patterns
- Combine with jailbreak detection to cover both technical and social attack vectors
- Review flagged prompts periodically to refine detection for your user base
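The practices above can be sketched in Python using only the field names shown under Configuration and Example violation. The helpers `build_policy` and `needs_review`, and the 0.9 review threshold, are hypothetical choices for this sketch and are not part of any documented API.

```python
import json


def build_policy(use_llm: bool, mode: str = "blocking") -> dict:
    """Build a policy entry in the shape shown under Configuration."""
    return {
        "policy_type": "unusual_prompt",
        "mode": mode,
        "config": {"use_llm": use_llm},
    }


def needs_review(violation: dict, threshold: float = 0.9) -> bool:
    """Queue unusual_prompt violations below the confidence threshold for review.

    The threshold is an assumption for this sketch; tune it for your user base.
    """
    if violation.get("policy_type") != "unusual_prompt":
        return False
    confidence = violation.get("details", {}).get("confidence", 0.0)
    return confidence < threshold


# The example violation from this page, parsed from JSON.
SAMPLE_VIOLATION = json.loads("""
{
  "policy_type": "unusual_prompt",
  "severity": "high",
  "description": "Emotional coercion detected in user input",
  "details": {
    "manipulation_type": "emotional_coercion",
    "confidence": 0.88,
    "analysis": "User employs guilt and urgency to override safety guidelines"
  }
}
""")

if __name__ == "__main__":
    # Latency-sensitive variant: the same policy with use_llm disabled.
    print(json.dumps(build_policy(use_llm=False), indent=2))
    print(needs_review(SAMPLE_VIOLATION))
```

Sending lower-confidence detections to a review queue is one way to carry out the periodic review suggested above. How violations are actually delivered depends on your integration.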