Nyraxis AI

Sensitive Topics

Detect self-harm, drugs, violence, and terrorism content in LLM inputs and outputs.

The Sensitive Topics provider identifies content related to dangerous or highly sensitive subject matter that most applications should not engage with. It covers categories where LLM responses could cause real-world harm.

What it detects

| Category | Examples |
| --- | --- |
| Self-harm | Suicide methods, self-injury encouragement, pro-anorexia content |
| Drugs | Drug manufacturing instructions, substance abuse promotion |
| Violence | Graphic violence descriptions, instructions for causing harm |
| Terrorism | Radicalization content, attack planning, extremist propaganda |

Configuration

{
  "policy_type": "sensitive_topics",
  "mode": "blocking",
  "config": {
    "threshold": 0.5
  }
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| threshold | float | 0.5 | Confidence threshold (0–1). Lower values increase sensitivity to borderline content. |
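
As a minimal sketch, a policy object with the shape shown above could be built and range-checked in Python before it is sent to the service. The helper name `build_sensitive_topics_policy` is hypothetical and not part of any Nyraxis SDK; only the field names and the 0–1 range are taken from this page.

```python
import json


def build_sensitive_topics_policy(threshold: float = 0.5, mode: str = "blocking") -> dict:
    """Return a policy dict shaped like the configuration example on this page.

    Hypothetical helper: the field names come from the documented JSON,
    but the function itself is an illustration, not an official API.
    """
    # The documented range for threshold is 0-1, so reject anything outside it.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be between 0 and 1, got {threshold}")
    return {
        "policy_type": "sensitive_topics",
        "mode": mode,
        "config": {"threshold": threshold},
    }


if __name__ == "__main__":
    # A lower threshold makes detection more sensitive to borderline content.
    print(json.dumps(build_sensitive_topics_policy(threshold=0.3), indent=2))
```

Validating the range locally catches a mistyped threshold (for example 30 instead of 0.3) before the policy is ever submitted.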

Example violation

{
  "allowed": false,
  "violations": [
    {
      "policy_type": "sensitive_topics",
      "severity": "critical",
      "description": "Self-harm content detected: discussion of harmful methods",
      "topic": "self_harm",
      "confidence": 0.88
    }
  ]
}
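
A response of this shape could be parsed along the following lines. This is a sketch that reads only the fields visible in the example; the `Violation` class and `parse_violations` function are illustrative names, and real responses may carry fields not shown here.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Violation:
    """Illustrative container for the violation fields shown in the example."""
    policy_type: str
    severity: str
    description: str
    topic: str
    confidence: float


def parse_violations(raw: str) -> Tuple[bool, List[Violation]]:
    """Return (allowed, sensitive-topic violations) from a JSON response string."""
    payload = json.loads(raw)
    violations = [
        Violation(
            policy_type=v["policy_type"],
            severity=v["severity"],
            description=v["description"],
            topic=v["topic"],
            confidence=v["confidence"],
        )
        for v in payload.get("violations", [])
        # Keep only violations raised by this provider.
        if v.get("policy_type") == "sensitive_topics"
    ]
    return payload["allowed"], violations


# The example violation from this page, reproduced as a string.
EXAMPLE = """
{
  "allowed": false,
  "violations": [
    {
      "policy_type": "sensitive_topics",
      "severity": "critical",
      "description": "Self-harm content detected: discussion of harmful methods",
      "topic": "self_harm",
      "confidence": 0.88
    }
  ]
}
"""

if __name__ == "__main__":
    allowed, violations = parse_violations(EXAMPLE)
    for v in violations:
        print(f"{v.topic} ({v.severity}): confidence {v.confidence:.2f}")
```

Filtering on `policy_type` keeps the handler focused on this provider when a response also contains violations from other policies.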

Best practices

  • Use mode: "blocking" for all sensitive topic categories, since these are high-risk by nature.
  • Set a lower threshold (e.g., 0.3) for self-harm detection where false negatives carry serious consequences.
  • Combine with Toxicity Detection for full content safety coverage.
  • Ensure your application provides appropriate crisis resources (e.g., helpline numbers) when self-harm content is detected.
  • Review flagged content regularly to ensure legitimate educational or news discussions are not over-blocked.
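
The crisis-resource recommendation above might be wired up roughly as follows. This is a sketch under stated assumptions: the function name and message constants are hypothetical, the helpline text is a placeholder to be replaced with resources appropriate to your users, and `self_harm` is the only topic value confirmed by the example on this page.

```python
from typing import Optional

# Placeholder text: substitute real, region-appropriate crisis resources.
CRISIS_MESSAGE = (
    "If you are struggling, support is available. "
    "[Insert the helpline details appropriate for your users' region here.]"
)
GENERIC_MESSAGE = "This request can't be processed because it touches on a restricted topic."


def user_message_for(result: dict) -> Optional[str]:
    """Return the text to show the user, or None when the content is allowed.

    `result` is assumed to have the shape of the example violation response.
    A missing "allowed" field is treated as blocked (fail closed).
    """
    if result.get("allowed", False):
        return None
    # Collect the topics reported by the sensitive_topics policy.
    topics = {
        v.get("topic")
        for v in result.get("violations", [])
        if v.get("policy_type") == "sensitive_topics"
    }
    # Self-harm gets a supportive response with crisis resources,
    # not a bare refusal.
    if "self_harm" in topics:
        return CRISIS_MESSAGE
    return GENERIC_MESSAGE
```

Treating a missing `allowed` field as blocked is a deliberate fail-closed choice for this sketch; an application with different risk tolerances could handle malformed responses separately.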
