Analysis · AI Models
8 days ago
Smaller Self-Supervised ViTs Localize Better than Larger Ones
A new arXiv study finds that smaller self-supervised Vision Transformers produce better foreground object localization than larger models. The paper attributes this to differences in attention map dynamics during training.