Gibberish Detection

The Gibberish Detection provider identifies text that is incoherent, nonsensical, or garbled. It catches both malformed inputs (which may indicate adversarial probing) and degraded LLM outputs (which indicate generation failures).

What it detects

Category	Examples
Random characters	`asdkjf lkasdjf klasjdf`
Garbled text	Corrupted or encoding-broken strings
Nonsensical sequences	Grammatically structured but semantically meaningless text
Repetitive loops	Degenerate output where the model repeats tokens endlessly

Configuration

{
  "policy_type": "gibberish_detection",
  "mode": "blocking",
  "config": {
    "threshold": 0.5
  }
}

Parameter	Type	Default	Description
`threshold`	float	`0.5`	Confidence threshold (0–1). Lower values flag more content as potentially gibberish.
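
The effect of the threshold can be sketched in a few lines of Python. This is an illustration, not the provider's implementation: it assumes a violation is raised when the reported confidence meets or exceeds `threshold`, and the helper name `is_flagged` is hypothetical.

```python
def is_flagged(confidence: float, threshold: float = 0.5) -> bool:
    """Assumed rule: flag text when confidence >= threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return confidence >= threshold


# Hypothetical gibberish confidence scores for three pieces of text.
scores = [0.30, 0.55, 0.96]

# The default threshold (0.5) flags more content than a stricter 0.7.
flagged_default = [s for s in scores if is_flagged(s)]
flagged_strict = [s for s in scores if is_flagged(s, threshold=0.7)]
```

Under this assumption, raising the threshold from 0.5 to 0.7 stops flagging the borderline 0.55 score while still catching the 0.96 one.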

Example violation

{
  "allowed": false,
  "violations": [
    {
      "policy_type": "gibberish_detection",
      "severity": "low",
      "description": "Incoherent text detected: input appears to be random characters",
      "confidence": 0.96
    }
  ]
}
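
A client might pull the gibberish violations out of a response like the one above as follows. This is a minimal sketch: the field names come from the example, while the helper `gibberish_violations` and its `min_confidence` parameter are hypothetical.

```python
import json

# The example violation response from above, as a JSON string.
response_text = """
{
  "allowed": false,
  "violations": [
    {
      "policy_type": "gibberish_detection",
      "severity": "low",
      "description": "Incoherent text detected: input appears to be random characters",
      "confidence": 0.96
    }
  ]
}
"""


def gibberish_violations(response: dict, min_confidence: float = 0.0) -> list:
    """Return gibberish_detection violations at or above min_confidence."""
    return [
        v
        for v in response.get("violations", [])
        if v.get("policy_type") == "gibberish_detection"
        and v.get("confidence", 0.0) >= min_confidence
    ]


response = json.loads(response_text)
hits = gibberish_violations(response)

if not response["allowed"]:
    for v in hits:
        print(f'{v["severity"]}: {v["description"]} ({v["confidence"]:.2f})')
```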

Best practices

Use on both inputs and outputs — gibberish inputs may be adversarial probes, while gibberish outputs indicate model failures.
Raise the threshold (e.g., 0.7) if your application handles multilingual or code-heavy content, which general-purpose classifiers may mistake for gibberish.
Combine with Prompt Injection detection — some injection attempts use obfuscated or encoded text that also triggers gibberish detection.
Monitor gibberish detections on outputs as an early warning signal for model degradation or context window overflow.
Use mode: "blocking" for inputs and mode: "warning" for outputs if you want to log degraded responses without hiding them from users.
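
The input/output split described above could be expressed as two policy objects. The sketch below builds them in Python using the fields from the Configuration section. How a policy is attached to inputs or outputs is not covered on this page, so the `"input"` and `"output"` keys are placeholders, and the helper `gibberish_policy` is hypothetical. It also assumes `"blocking"` and `"warning"` are the only modes, since those are the only two mentioned here.

```python
import json


def gibberish_policy(mode: str, threshold: float = 0.5) -> dict:
    """Build a gibberish_detection policy object (hypothetical helper)."""
    # Assumption: only the two modes mentioned on this page are accepted.
    if mode not in ("blocking", "warning"):
        raise ValueError(f"unsupported mode: {mode!r}")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "policy_type": "gibberish_detection",
        "mode": mode,
        "config": {"threshold": threshold},
    }


# Placeholder keys: block gibberish inputs, only log gibberish outputs.
policies = {
    "input": gibberish_policy("blocking"),
    "output": gibberish_policy("warning", threshold=0.7),
}

print(json.dumps(policies, indent=2))
```

The higher output threshold here is only an example of tuning the two directions independently; pick values based on the content your application handles.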
