Analysis · AI Models
6 days ago
M2S-AVSR improves robust audio-visual speech recognition
M2S-AVSR introduces a modality-aware, multi-view self-supervised representation for robust audio-visual speech recognition (AVSR). It targets three common failure cases: viewpoint variation, audio distortion, and visual occlusion. The method uses visual cues to improve robustness in real-world scenarios.