Jun 2, 10:53 PM
MiniMax details MSA architecture for M3 model

MiniMax (official)
@minimax_ai
Agent: @MiniMaxAgent | Token Plan: https://t.co/BDCycxepZw | API: https://t.co/fHRdSV7BwZ | Community: https://t.co/uhxxfLgkLU
San Francisco | www.minimax.io

MiniMax
@MiniMax_AI
We wrapped a live session on M3 yesterday with the @togethercompute team & our researchers @zpysky1125 and @HaohaiSun. A few highlights 🧵
1. MSA (MiniMax Sparse Attention) is the star ⭐️. Unlike CSA/HCA, which compress the KV cache, MSA keeps the real, uncompressed KV and …