Analysis · AI Models · Visual AI
6 days ago
DRIFT: Residual Flow Adapter for VLM Continuous Outputs
Proposes DRIFT, a residual flow adapter that lets vision-language models decode continuous outputs by modeling residual prediction flows. Improves performance on visual grounding and referring segmentation, addressing the limitations of discrete token decoding.