Analysis, AI Models

Papers analyze SAE reliability and sparsity, and propose new variants

Several new arXiv papers examine the reliability of sparse autoencoders (SAEs) and the effects of sparsity, and propose improved variants. 'Rational Sparse Autoencoder' learns sparsity mechanisms, while 'Cosine-Scored SAEs' address norm inflation. 'SAE Interventions are Unreliable' warns that suppressing features does not prevent the behavior from recovering.
