AnalysisCybersecurityAI Agents
2 hours ago
MosaicLeaks: New benchmark tests if research agents keep secrets
MosaicLeaks is a new benchmark for measuring whether AI research agents inadvertently leak sensitive information. The assessment includes tasks requiring agents to work with confidential data while maintaining secrecy.
2 hours ago