Analysis · AI Models
6 days ago
ViCuR: Visual Cues as Recoverable Privilege for Multimodal On-Policy Distillation
ViCuR treats visual cues as a recoverable privilege signal for multimodal on-policy distillation. The student is trained on trajectories sampled from its own policy under teacher supervision, which improves its reasoning performance.