17 days ago
Nvidia releases LocateAnything for fast vision-language grounding
LocateAnything is 10x faster than Qwen3-VL thanks to parallel box decoding. Weights, code, and a demo are available on Hugging Face and GitHub.