Analysis · AI Models
8 days ago
VistaHop benchmark tests multi-hop visual reasoning
VistaHop is a new benchmark for evaluating multimodal large reasoning models (MLRMs) on multi-hop visual reasoning in Visual DeepSearch tasks. It requires agents to inspect image regions, ground intermediate reasoning in visual evidence, and connect fine-grained clues across long contexts.