Analysis · AI Models

Nous Research Introduces CNA for Sparse MLP Circuit Steering

CNA identifies the neurons responsible for refusal in instruction-tuned language models without training a sparse autoencoder (SAE) or modifying model weights. It uses neuron attribution to steer a sparse circuit of MLP neurons.

25 days ago