Analysis, AI Models, Policy

Norm-preserving abliteration on Qwen3.6-35B-A3B achieves 0% refusal with intact benchmarks

Norm-preserving abliteration removes refusal behavior from Qwen3.6-35B-A3B while preserving its benchmark scores. The technique builds on the refusal-direction method of Arditi et al. (2024), and the dataset has been open-sourced.
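For readers unfamiliar with the refusal-direction idea, here is a minimal NumPy sketch. The first two steps follow Arditi et al. (2024): estimate a unit "refusal direction" as the difference between mean activations on harmful and harmless prompts, then project that direction out of a weight matrix that writes into the residual stream. The third step, rescaling each column back to its original norm, is only an assumption about what "norm-preserving" could mean here; the brief does not describe the actual procedure. The function names and toy shapes are hypothetical.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means direction, normalized to unit length.

    Both inputs have shape (n_prompts, d_model).
    """
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def ablate_norm_preserving(W, r, eps=1e-12):
    """Remove direction r from the output space of W, keeping column norms.

    W has shape (d_model, d_in) and is applied as y = W @ x.
    ASSUMPTION: per-column rescaling is one plausible reading of
    "norm-preserving"; it is not confirmed by the source.
    """
    # Weight orthogonalization: W' = W - r r^T W
    W_proj = W - np.outer(r, r @ W)
    # Rescale each column to the norm it had before the projection.
    orig_norms = np.linalg.norm(W, axis=0, keepdims=True)
    new_norms = np.linalg.norm(W_proj, axis=0, keepdims=True)
    return W_proj * (orig_norms / np.maximum(new_norms, eps))

# Toy usage with random data standing in for real activations and weights.
rng = np.random.default_rng(0)
harmful = rng.normal(size=(32, 8)) + 1.0
harmless = rng.normal(size=(32, 8))
r = refusal_direction(harmful, harmless)
W = rng.normal(size=(8, 5))
W_abl = ablate_norm_preserving(W, r)
```

Because each column is only scaled after the projection, the ablated matrix still writes nothing along `r`, while every column keeps its original length.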

6 hours ago