Frontier models show gaps between benchmarks and health AI robustness


10 hours ago


Adversarial evaluation reveals gaps between benchmark success and robustness in leading frontier models for health AI. Current health AI benchmarks may not capture clinically relevant performance, according to a Nature Medicine study.
