Nyraxis AI

Unusual Prompt

Detects psychological manipulation, social engineering, and emotional coercion in prompts.

Unusual Prompt detection identifies social engineering and psychological manipulation techniques in user inputs. It catches attempts to use emotional coercion, urgency fabrication, and authority impersonation to push an LLM beyond its intended boundaries.

What it detects

  • Psychological manipulation tactics
  • Social engineering patterns (authority impersonation, urgency fabrication)
  • Emotional coercion ("if you don't help me, something bad will happen")
  • Guilt-based manipulation
  • Flattery-based compliance attacks
  • Threat-based prompt manipulation

Configuration

{
  "policy_type": "unusual_prompt",
  "mode": "blocking",
  "config": {
    "use_llm": true
  }
}
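
If the policy is generated from code rather than written by hand, it may help to build it with a small helper. The following is a minimal sketch: the function name `build_unusual_prompt_policy` is hypothetical and not part of any Nyraxis SDK, and only the fields shown in the configuration above are used.

```python
import json


def build_unusual_prompt_policy(use_llm: bool = True) -> dict:
    """Return a policy dict with the same shape as the configuration above.

    Hypothetical helper for illustration; only the documented fields
    (policy_type, mode, config.use_llm) are set.
    """
    return {
        "policy_type": "unusual_prompt",
        "mode": "blocking",
        "config": {"use_llm": use_llm},
    }


# Build a heuristic-only variant and serialize it for submission or storage.
policy = build_unusual_prompt_policy(use_llm=False)
print(json.dumps(policy, indent=2))
```

How the resulting JSON is submitted (API call, dashboard, config file) is not covered here and depends on your deployment.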

Example violation

{
  "policy_type": "unusual_prompt",
  "severity": "high",
  "description": "Emotional coercion detected in user input",
  "details": {
    "manipulation_type": "emotional_coercion",
    "confidence": 0.88,
    "analysis": "User employs guilt and urgency to override safety guidelines"
  }
}
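
A consumer of these violations will typically want to act on `severity` and `details.confidence`. The sketch below parses the example payload and applies a simple rule. The function `should_escalate` and the 0.8 threshold are assumptions chosen for illustration, not documented defaults.

```python
import json

# The example violation from above, as it might arrive in a response body.
RAW_VIOLATION = """
{
  "policy_type": "unusual_prompt",
  "severity": "high",
  "description": "Emotional coercion detected in user input",
  "details": {
    "manipulation_type": "emotional_coercion",
    "confidence": 0.88,
    "analysis": "User employs guilt and urgency to override safety guidelines"
  }
}
"""


def should_escalate(violation: dict, min_confidence: float = 0.8) -> bool:
    """Return True for high-severity unusual_prompt violations at or above
    the confidence threshold.

    Hypothetical rule: the threshold is an assumption, tune it to your traffic.
    """
    if violation.get("policy_type") != "unusual_prompt":
        return False
    confidence = violation.get("details", {}).get("confidence", 0.0)
    return violation.get("severity") == "high" and confidence >= min_confidence


violation = json.loads(RAW_VIOLATION)
print(should_escalate(violation))
```

Using `.get` with defaults keeps the check from raising if a field is missing, since the full response schema is not specified on this page.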

Best practices

  • Enable use_llm for higher accuracy on nuanced manipulation attempts
  • Disable use_llm if latency is a concern; heuristic detection still catches common patterns
  • Combine with jailbreak detection to cover both technical and social attack vectors
  • Review flagged prompts periodically to refine detection for your user base
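
For the periodic review suggested above, one possible starting point is to group stored violations by `details.manipulation_type` and look at counts and average confidence. This is a sketch under assumptions: the function `summarize_by_type` is hypothetical, the records are assumed to follow the example violation's shape, and `authority_impersonation` is an illustrative label (only `emotional_coercion` appears on this page).

```python
from collections import defaultdict
from statistics import mean


def summarize_by_type(violations: list[dict]) -> dict:
    """Group violations by manipulation type; report count and mean confidence."""
    grouped = defaultdict(list)
    for violation in violations:
        details = violation.get("details", {})
        label = details.get("manipulation_type", "unknown")
        grouped[label].append(details.get("confidence", 0.0))
    return {
        label: {"count": len(scores), "mean_confidence": round(mean(scores), 2)}
        for label, scores in grouped.items()
    }


# Sample records; values other than the documented example are made up.
sample = [
    {"details": {"manipulation_type": "emotional_coercion", "confidence": 0.88}},
    {"details": {"manipulation_type": "emotional_coercion", "confidence": 0.72}},
    {"details": {"manipulation_type": "authority_impersonation", "confidence": 0.91}},
]

summary = summarize_by_type(sample)
for label, stats in summary.items():
    print(label, stats)
```

A type with many flags but low average confidence may be worth sampling by hand to check for false positives in your user base.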
