Gibberish Detection
Detect incoherent, nonsensical, and garbled text in LLM inputs and outputs.
The Gibberish Detection provider identifies text that is incoherent, nonsensical, or garbled. It catches both malformed inputs (which may indicate adversarial probing) and degraded LLM outputs (which indicate generation failures).
What it detects
| Category | Examples |
|---|---|
| Random characters | asdkjf lkasdjf klasjdf |
| Garbled text | Corrupted or encoding-broken strings |
| Nonsensical sequences | Grammatically structured but semantically meaningless text |
| Repetitive loops | Degenerate output where the model repeats tokens endlessly |
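
To make the "repetitive loops" category more concrete, here is a toy heuristic in Python. It is not the provider's classifier, which this page does not describe. The function name `repetition_ratio` and the choice of word trigrams are assumptions made purely for illustration.

```python
from collections import Counter


def repetition_ratio(text: str, n: int = 3) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    Illustrative only: a real detector would use a trained classifier,
    not a single hand-written statistic.
    """
    words = text.split()
    if len(words) < n:
        return 0.0
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


if __name__ == "__main__":
    print(repetition_ratio("the cat sat on the mat"))
    print(repetition_ratio("ok ok ok ok ok ok ok ok"))
```

Normal prose tends to score near zero, while degenerate output that repeats the same tokens scores close to one. A statistic like this only covers one row of the table; random characters and semantically empty text need different signals.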
Configuration
{
  "policy_type": "gibberish_detection",
  "mode": "blocking",
  "config": {
    "threshold": 0.5
  }
}

| Parameter | Type | Default | Description |
|---|---|---|---|
| `threshold` | float | 0.5 | Confidence threshold (0–1). Lower values flag more content as potentially gibberish. |
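
As a rough sketch of how this configuration might be built and reasoned about in Python: the helper names `make_gibberish_policy` and `would_flag` are hypothetical, and the rule that text is flagged when its confidence is at or above the threshold is an assumption inferred from the parameter description, not confirmed behavior.

```python
import json


def make_gibberish_policy(threshold: float = 0.5, mode: str = "blocking") -> dict:
    """Build a policy dict in the shape shown above (hypothetical helper)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "policy_type": "gibberish_detection",
        "mode": mode,
        "config": {"threshold": threshold},
    }


def would_flag(confidence: float, threshold: float) -> bool:
    """Assumption: a detection counts when confidence >= threshold."""
    return confidence >= threshold


if __name__ == "__main__":
    policy = make_gibberish_policy(threshold=0.7)
    print(json.dumps(policy, indent=2))
    # The same 0.6-confidence detection under two thresholds.
    print(would_flag(0.6, 0.5), would_flag(0.6, 0.7))
```

Under that assumption, a borderline detection with confidence 0.6 is flagged at the default threshold of 0.5 but passes at 0.7, which matches the statement that lower values flag more content.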
Example violation
{
  "allowed": false,
  "violations": [
    {
      "policy_type": "gibberish_detection",
      "severity": "low",
      "description": "Incoherent text detected: input appears to be random characters",
      "confidence": 0.96
    }
  ]
}

Best practices
- Use on both inputs and outputs — gibberish inputs may be adversarial probes, while gibberish outputs indicate model failures.
- A higher threshold (e.g., `0.7`) is appropriate if your application handles multilingual or code-heavy content that may appear nonsensical to general classifiers.
- Combine with Prompt Injection detection: some injection attempts use obfuscated or encoded text that also triggers gibberish detection.
- Monitor gibberish detections on outputs as an early warning signal for model degradation or context window overflow.
- Use `mode: "blocking"` for inputs and `mode: "warning"` for outputs if you want to log degraded responses without hiding them from users.
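
The split between blocking inputs and warning on outputs could be handled on the application side roughly as follows. This is a minimal sketch that assumes the response has the shape of the example violation above; `handle_result` and the logger name are hypothetical, and the service may already enforce `mode` itself.

```python
import logging

logger = logging.getLogger("gibberish_guard")

# Response copied from the example violation on this page.
EXAMPLE_RESULT = {
    "allowed": False,
    "violations": [
        {
            "policy_type": "gibberish_detection",
            "severity": "low",
            "description": "Incoherent text detected: input appears to be random characters",
            "confidence": 0.96,
        }
    ],
}


def handle_result(result: dict, mode: str) -> bool:
    """Return True if the text may pass through, False if it should be blocked.

    Gibberish violations are always logged; they only stop the text
    when mode is "blocking".
    """
    hits = [
        v for v in result.get("violations", [])
        if v.get("policy_type") == "gibberish_detection"
    ]
    for v in hits:
        logger.warning(
            "gibberish detected (confidence=%.2f): %s",
            v.get("confidence", 0.0),
            v.get("description", ""),
        )
    return not (hits and mode == "blocking")


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(handle_result(EXAMPLE_RESULT, mode="blocking"))  # input path
    print(handle_result(EXAMPLE_RESULT, mode="warning"))   # output path
```

With this arrangement a garbled input is rejected before it reaches the model, while a degraded output is still shown to the user and leaves a log entry that can feed the monitoring described above.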