Spec-Driven Testing for Agents via Poem Jailbreaks

10 days ago

Wrapping a malicious instruction in a poem is an effective jailbreak against large models but not small ones. Steven Willmott argues this shows larger models aren't straightforwardly better.