Analysis | AI Models | July 2, 2026

RL-finetuned VLMs vulnerable to weak visual perturbations

An Apple study finds that RL fine-tuning improves VLMs on visual reasoning benchmarks, yet the fine-tuned models remain vulnerable to weak visual perturbations. The paper also examines whether the models' chain-of-thought reasoning stays consistent under those perturbations.

1 source

On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs (machinelearning.apple.com)
