Hallucination Detection

The Hallucination Detection provider compares LLM-generated output against the provided source context to identify claims that are not supported by the reference material. It flags fabricated facts, unsupported assertions, and contradictions.

What it detects

Category	Examples
Ungrounded claims	Statements with no basis in the provided context
Contradictions	Output that directly contradicts the source material
Fabricated details	Invented names, dates, statistics, or quotes
Unsupported inferences	Conclusions that go beyond what the context supports

Configuration

{
  "policy_type": "hallucination",
  "mode": "blocking",
  "config": {
    "threshold": 0.4
  }
}

Parameter	Type	Default	Description
`threshold`	float	`0.4`	Confidence threshold (0–1). Lower values flag more content as potentially hallucinated.
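
As a rough illustration of how the threshold behaves, the sketch below assumes a claim is flagged when its hallucination confidence meets or exceeds `threshold`. The comparison rule, the `is_flagged` helper, and the sample scores are assumptions for illustration; this page does not state the provider's exact scoring logic.

```python
def is_flagged(confidence: float, threshold: float = 0.4) -> bool:
    """Return True when a claim's hallucination confidence reaches the threshold.

    Assumption: flagging uses `confidence >= threshold`. The provider's
    actual comparison is not documented on this page.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return confidence >= threshold


# Hypothetical confidence scores for three claims.
scores = [0.25, 0.45, 0.87]

# Under this assumption, a lower threshold flags more claims.
flagged_sensitive = [s for s in scores if is_flagged(s, 0.2)]
flagged_default = [s for s in scores if is_flagged(s, 0.4)]
flagged_lenient = [s for s in scores if is_flagged(s, 0.8)]
```

Under this reading, moving from the default of 0.4 down to 0.2 would also flag the lowest-scoring claim, while raising it to 0.8 would keep only the most confident detection.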

To use hallucination detection, pass both the `output` and `context` fields in your evaluation request:

{
  "output": "The company was founded in 2019 and has 500 employees.",
  "context": "Acme Corp was founded in 2021. The company currently employs 120 people.",
  "mode": "thorough"
}
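
A client might assemble this request body along the following lines. This is a minimal sketch: `build_evaluation_request` is a hypothetical helper rather than part of any SDK, and it only serializes the three fields shown in the request above without sending anything.

```python
import json


def build_evaluation_request(output: str, context: str, mode: str = "thorough") -> str:
    """Serialize an evaluation request body with the output, context, and mode fields.

    Raises ValueError when either text field is empty, since detection
    needs both the generated output and the source context to compare.
    """
    if not output.strip():
        raise ValueError("output must be a non-empty string")
    if not context.strip():
        raise ValueError("context must be a non-empty string")
    return json.dumps({"output": output, "context": context, "mode": mode})


body = build_evaluation_request(
    output="The company was founded in 2019 and has 500 employees.",
    context="Acme Corp was founded in 2021. The company currently employs 120 people.",
)
```

Checking for a missing context on the client side surfaces the problem before the request is made, instead of relying on the service to reject it.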

Example violation

{
  "allowed": false,
  "violations": [
    {
      "policy_type": "hallucination",
      "severity": "medium",
      "description": "Ungrounded claim: founding year and employee count contradict source context",
      "claims": [
        {"claim": "founded in 2019", "verdict": "contradicted"},
        {"claim": "500 employees", "verdict": "contradicted"}
      ],
      "confidence": 0.87
    }
  ]
}
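
A response shaped like the example above can be unpacked claim by claim. The sketch below embeds that example so it runs offline; `summarize_violations` is a hypothetical helper, and it assumes every hallucination violation carries a `claims` list with `claim` and `verdict` keys as shown.

```python
import json
from typing import List

# The example response from this page, embedded so the sketch needs no network.
response_text = """
{
  "allowed": false,
  "violations": [
    {
      "policy_type": "hallucination",
      "severity": "medium",
      "description": "Ungrounded claim: founding year and employee count contradict source context",
      "claims": [
        {"claim": "founded in 2019", "verdict": "contradicted"},
        {"claim": "500 employees", "verdict": "contradicted"}
      ],
      "confidence": 0.87
    }
  ]
}
"""


def summarize_violations(response: dict) -> List[str]:
    """Return one 'claim: verdict' line per claim in hallucination violations."""
    lines = []
    for violation in response.get("violations", []):
        if violation.get("policy_type") != "hallucination":
            continue  # Skip violations reported by other policy types.
        for claim in violation.get("claims", []):
            lines.append(f'{claim["claim"]}: {claim["verdict"]}')
    return lines


response = json.loads(response_text)
summary = summarize_violations(response)
```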

Best practices

The default threshold of 0.4 is intentionally lower than that of other providers, because hallucination detection benefits from higher sensitivity.
Always provide the source `context` field; without it, hallucination detection cannot run.
Use `"mode": "blocking"` for RAG applications where factual accuracy is critical (legal, medical, financial).
Use `"mode": "warning"` for creative or conversational agents where some extrapolation is acceptable.
Monitor the dashboard to identify which queries most frequently produce hallucinated responses.
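
The mode guidance above could be encoded when generating policy configurations. The following is only a sketch: the use-case labels, `make_policy`, and `policy_for` are hypothetical, and it assumes `"blocking"` and `"warning"` are the only modes of interest because they are the only ones named on this page.

```python
# Hypothetical labels for domains where factual accuracy is critical.
BLOCKING_USE_CASES = {"legal", "medical", "financial"}

# Assumption: only the two modes mentioned on this page are handled here.
KNOWN_MODES = {"blocking", "warning"}


def make_policy(mode: str, threshold: float = 0.4) -> dict:
    """Build a hallucination policy dict in the shape shown under Configuration."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "policy_type": "hallucination",
        "mode": mode,
        "config": {"threshold": threshold},
    }


def policy_for(use_case: str) -> dict:
    """Pick blocking for accuracy-critical domains and warning otherwise."""
    mode = "blocking" if use_case in BLOCKING_USE_CASES else "warning"
    return make_policy(mode)


legal_policy = policy_for("legal")
chat_policy = policy_for("conversational")
```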
