Event · AI Models · Policy · Cybersecurity
Jun 10, 3:41 PM
Cybersecurity researchers criticize Anthropic Fable guardrails
Anthropic's Fable model, a public version of Mythos, uses strict keyword-based guardrails that block cybersecurity tasks, frustrating researchers. Security expert Valentina Palmiotti noted that even innocuous requests, such as reading a blog post, trigger the safety flag, while Matt Suiche said the model falls back to Claude Opus 4.8 whenever a guardrail is hit.
