- Anthropic dares you to jailbreak its new AI model (Ars Technica)
- Constitutional Classifiers: Defending against universal jailbreaks (Anthropic)
- Anthropic makes ‘jailbreak’ advance to stop AI models producing harmful results (Financial Times)
- Anthropic has a new way to protect large language models against jailbreaks (MIT Technology Review)
- Anthropic claims new AI security method blocks 95% of jailbreaks, invites red teamers to try (VentureBeat)
Tips, Trends, and Tools