▣DatasetCybersecurityFree
In-The-Wild Jailbreak Prompts
CCS'24 dataset of 15,140 in-the-wild ChatGPT prompts including 1,405 real jailbreak prompts for training and benchmarking jailbreak detectors.
In-The-Wild Jailbreak Prompts — real-world LLM jailbreak dataset
This CCS 2024 dataset collects 15,140 ChatGPT prompts scraped from Reddit, Discord, prompt-sharing websites, and open datasets — including 1,405 verified jailbreak prompts gathered over roughly a year — the largest measurement study of in-the-wild jailbreaks at its release.
Key features
- 1,405 real jailbreak prompts plus a large pool of benign prompts for contrastive evaluation
- Sourced from four platforms with timestamps to study how jailbreaks evolve over time
- A ready-made corpus for training or benchmarking prompt-injection and jailbreak detectors
- Accompanied by analysis of prompt-sharing communities and attack effectiveness
- Grounds red-team coverage in prompts that attackers actually used in the wild
Rather than synthetic attacks, this dataset gives defenders authentic adversarial inputs, making it a strong foundation for evaluating whether a guardrail catches the jailbreaks people really deploy.
Curated mirror of the open-source In-The-Wild Jailbreak Prompts (MIT). Get it from the source.