Vigil — LLM prompt injection & jailbreak detection

Vigil is a Python library and REST API for scanning LLM prompts and responses for prompt injection, jailbreaks, and other risky inputs before they reach your model. It layers several independent detection scanners so no single technique becomes a blind spot.

Key features

Ensemble scanners: vector-database similarity to known attacks, a transformer classifier, YARA/heuristic rules, prompt-response relevance, and canary-token leak detection
Ships curated embeddings and signatures for documented prompt-injection and jailbreak techniques
Runs as an embeddable library or a standalone REST API service
Configurable per-scanner thresholds and pluggable custom detectors
Local-first: works with self-hosted embedding models, so prompt data never leaves your stack

Vigil sits in front of any LLM as an input/output firewall, giving agent builders an auditable guardrail layer that flags adversarial inputs instead of silently passing them through.

Curated mirror of the open-source Vigil (Apache-2.0). Get it from the source.

Vigil

Vigil — LLM prompt injection & jailbreak detection

Key features

More from @ai-supply