Nyraxis AI

NSFW Detection

Detect sexual and explicit content in LLM inputs and outputs.

The NSFW Detection provider identifies sexual, explicit, and adult content that is inappropriate for general audiences. It flags both overt explicit material and suggestive content that crosses professional boundaries.

What it detects

| Category | Examples |
| --- | --- |
| Sexual content | Explicit sexual descriptions, pornographic material |
| Suggestive content | Sexually suggestive language, innuendo intended to provoke |
| Adult solicitation | Requests to generate explicit or adult-only material |

Configuration

{
  "policy_type": "nsfw_detection",
  "mode": "blocking",
  "config": {
    "threshold": 0.5
  }
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| threshold | float | 0.5 | Confidence threshold (0–1). Lower values catch more borderline content. |
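
To make the threshold semantics concrete, here is a minimal Python sketch. It assumes content is flagged when the detector's confidence is at or above the threshold; the exact comparison is not stated on this page, so treat that as an assumption. `build_policy` and `is_flagged` are hypothetical helper names for illustration, not part of the Nyraxis API.

```python
import json


def build_policy(threshold: float = 0.5, mode: str = "blocking") -> dict:
    """Build a policy dict in the shape shown above (hypothetical helper)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "policy_type": "nsfw_detection",
        "mode": mode,
        "config": {"threshold": threshold},
    }


def is_flagged(confidence: float, policy: dict) -> bool:
    """Assumed semantics: flag when confidence meets or exceeds the threshold."""
    return confidence >= policy["config"]["threshold"]


strict = build_policy(threshold=0.3)
default = build_policy()

# A borderline score of 0.4 is caught only by the stricter (lower) threshold.
print(is_flagged(0.4, strict))   # True
print(is_flagged(0.4, default))  # False
print(json.dumps(default, indent=2))
```

Under this assumption, lowering the threshold from 0.5 to 0.3 widens the set of flagged content, which matches the description of lower values catching more borderline material.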

Example violation

{
  "allowed": false,
  "violations": [
    {
      "policy_type": "nsfw_detection",
      "severity": "high",
      "description": "Explicit sexual content detected",
      "confidence": 0.95
    }
  ]
}
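
A response in this shape can be handled with a few lines of Python. The sketch below parses the example body above and prints each NSFW violation; `nsfw_violations` is a hypothetical helper, and how the response is obtained (client library, HTTP call) is left out because it is not described here.

```python
import json

# Response body copied from the example above.
raw = """
{
  "allowed": false,
  "violations": [
    {
      "policy_type": "nsfw_detection",
      "severity": "high",
      "description": "Explicit sexual content detected",
      "confidence": 0.95
    }
  ]
}
"""


def nsfw_violations(response: dict) -> list:
    """Return only the violations raised by the NSFW policy."""
    return [
        v for v in response.get("violations", [])
        if v.get("policy_type") == "nsfw_detection"
    ]


response = json.loads(raw)
if not response["allowed"]:
    for v in nsfw_violations(response):
        print(f'{v["severity"]}: {v["description"]} (confidence {v["confidence"]:.2f})')
# prints: high: Explicit sexual content detected (confidence 0.95)
```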

Best practices

  • Keep the threshold at 0.5 or lower for any public-facing or workplace application.
  • Always use mode: "blocking" — NSFW content rarely warrants a warning-only approach.
  • Combine with Toxicity Detection to cover the full spectrum of harmful content.
  • Test with edge cases relevant to your domain — medical or educational content may require threshold tuning.
  • Monitor false positives in the dashboard, especially if your application handles health or anatomy topics.
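
One way to approach the threshold tuning mentioned above is to score a small labeled set from your own domain and compare false positive rates across candidate thresholds. The sketch below uses made-up confidence values purely for illustration, and it assumes content is flagged when confidence is at or above the threshold; `false_positive_rate` is a hypothetical helper.

```python
# Hypothetical labeled samples: (detector confidence, is the content actually NSFW?)
# The scores are invented for illustration, not real detector output.
samples = [
    (0.92, True),
    (0.61, True),
    (0.48, False),  # e.g. an anatomy lesson scored as borderline
    (0.35, False),
    (0.10, False),
]


def false_positive_rate(samples, threshold):
    """Fraction of benign samples that would be flagged at this threshold."""
    benign = [conf for conf, is_nsfw in samples if not is_nsfw]
    if not benign:
        return 0.0
    flagged = sum(1 for conf in benign if conf >= threshold)
    return flagged / len(benign)


for threshold in (0.3, 0.4, 0.5):
    rate = false_positive_rate(samples, threshold)
    print(f"threshold={threshold}: false positive rate {rate:.2f}")
```

On this toy data, a threshold of 0.3 flags two of the three benign samples while 0.5 flags none, which illustrates the trade-off between catching borderline content and blocking legitimate health or educational material.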
