Hallucination Detection
Detect ungrounded claims by comparing LLM output against source context.
The Hallucination Detection provider compares LLM-generated output against the provided source context to identify claims that are not supported by the reference material. It flags fabricated facts, unsupported assertions, and contradictions.
What it detects
| Category | Examples |
|---|---|
| Ungrounded claims | Statements with no basis in the provided context |
| Contradictions | Output that directly contradicts the source material |
| Fabricated details | Invented names, dates, statistics, or quotes |
| Unsupported inferences | Conclusions that go beyond what the context supports |
Configuration
{
  "policy_type": "hallucination",
  "mode": "blocking",
  "config": {
    "threshold": 0.4
  }
}

| Parameter | Type | Default | Description |
|---|---|---|---|
| `threshold` | float | 0.4 | Confidence threshold (0–1). Lower values flag more content as potentially hallucinated. |
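
To make the effect of the threshold concrete, here is a minimal Python sketch. It assumes that each claim receives a score between 0 and 1 and is flagged when that score meets or exceeds the threshold; the provider's actual scoring is not documented here, and the function name `flag_claims` and the sample scores are hypothetical.

```python
def flag_claims(scores: dict[str, float], threshold: float = 0.4) -> list[str]:
    """Return the claims whose score meets or exceeds the threshold.

    Assumption: a higher score means the claim is more likely hallucinated.
    This is an illustration only, not the provider's implementation.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [claim for claim, score in scores.items() if score >= threshold]


# Hypothetical per-claim scores, invented for illustration.
scores = {
    "founded in 2019": 0.87,
    "has 500 employees": 0.55,
    "is named Acme Corp": 0.10,
}

print(flag_claims(scores, threshold=0.4))  # ['founded in 2019', 'has 500 employees']
print(flag_claims(scores, threshold=0.7))  # ['founded in 2019']
```

Under this assumption, lowering the threshold from 0.7 to 0.4 flags one additional claim, which matches the table's note that lower values flag more content.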
To use hallucination detection, pass both the `output` and `context` fields in your evaluation request:
{
  "output": "The company was founded in 2019 and has 500 employees.",
  "context": "Acme Corp was founded in 2021. The company currently employs 120 people.",
  "mode": "thorough"
}

Example violation
{
  "allowed": false,
  "violations": [
    {
      "policy_type": "hallucination",
      "severity": "medium",
      "description": "Ungrounded claim: founding year and employee count contradict source context",
      "claims": [
        {"claim": "founded in 2019", "verdict": "contradicted"},
        {"claim": "500 employees", "verdict": "contradicted"}
      ],
      "confidence": 0.87
    }
  ]
}

Best practices
- The default threshold of `0.4` is intentionally lower than other providers; hallucination detection benefits from higher sensitivity.
- Always provide the source `context` field; without it, hallucination detection cannot run.
- Use `mode: "blocking"` for RAG applications where factual accuracy is critical (legal, medical, financial).
- Use `mode: "warning"` for creative or conversational agents where some extrapolation is acceptable.
- Monitor the dashboard to identify which queries most frequently produce hallucinated responses.
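
The request and violation payloads shown earlier can be handled on the client side along the following lines. This is a minimal sketch: the helper names `build_request` and `summarize_violations` are hypothetical, no HTTP call is made because the endpoint is not specified on this page, and the field names are taken from the examples above.

```python
def build_request(output: str, context: str, mode: str = "thorough") -> dict:
    """Build an evaluation request body using the fields from the example above."""
    if not context or not context.strip():
        # Per the best practices, detection cannot run without source context.
        raise ValueError("context is required for hallucination detection")
    return {"output": output, "context": context, "mode": mode}


def summarize_violations(response: dict) -> list[str]:
    """Return one 'claim: verdict' line per claim in hallucination violations."""
    lines = []
    for violation in response.get("violations", []):
        if violation.get("policy_type") != "hallucination":
            continue  # skip violations reported by other policy types
        for claim in violation.get("claims", []):
            lines.append(f'{claim["claim"]}: {claim["verdict"]}')
    return lines


# The example violation from this page, written as a Python dict.
response = {
    "allowed": False,
    "violations": [
        {
            "policy_type": "hallucination",
            "severity": "medium",
            "claims": [
                {"claim": "founded in 2019", "verdict": "contradicted"},
                {"claim": "500 employees", "verdict": "contradicted"},
            ],
            "confidence": 0.87,
        }
    ],
}

for line in summarize_violations(response):
    print(line)
# founded in 2019: contradicted
# 500 employees: contradicted
```

Rejecting an empty `context` before sending the request surfaces the missing-context problem in the caller, rather than relying on the provider to report it.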