Analysis · Cybersecurity

GuardNet uses shallow neural networks to detect prompt injection and jailbreak attacks

The paper introduces GuardNet, an ensemble of shallow neural networks for detecting prompt injection and jailbreak attacks on LLMs. Its benchmark evaluations may be affected by contamination and partial information.

AIBriefs · 6 days ago