Analysis · AI Models
7 days ago
Spectral scaling laws of Muon optimizer
The paper derives spectral scaling laws for Muon, the orthogonalizing optimizer used in recent open-source LLMs. The analysis shows how Muon's update rule affects training dynamics across model scales.