Analysis · AI Models
8 days ago
Hand trajectory fusion improves egocentric video query grounding
The paper proposes fusing hand-motion cues with video appearance features to temporally ground natural-language queries in first-person video. The method outperforms prior approaches on the EgoNLQ benchmark.