Analysis | AI Models
Jun 15, 4:00 AM
New papers examine weight norm's role in grokking delay
Three new theoretical papers investigate how weight norm controls grokking's delayed generalization. One paper identifies a critical norm threshold, while another shows that weight norm alone sets the grokking timescale via a causal delay law.