Backed by Y Combinator
STRESS TESTING ENTERPRISE & FOUNDATIONAL AI MODELS TO FIND FAILURE MODES.
We provide a repository of stress testing, jailbreaking, and red teaming methods—a knowledge base to understand and improve the performance and safety of AI models.

FEATURED BLOGS

2025-05-06
General Analysis x Together AI
TLDR: We are excited to announce our partnership with Together AI to stress-test the safety of open-source (and closed) language models.

2025-03-21
The Jailbreak Cookbook
We have created a comprehensive overview of the most influential LLM jailbreaking methods.

2025-02-19
Generating Diverse Test Cases with Diversity Transfer from LegalBench
TLDR: We used LegalBench as a diversity source to make our generated red teaming questions more varied. We show that diversity transfer from a domain-specific knowledge base is a simple and practical way to build a solid red teaming benchmark.