Jun 17, 7:44 AM
MiniMax releases Sparse Attention (MSA) for long-context LLMs
The method, built on Grouped Query Attention (GQA), was trained on a 109B-parameter Mixture-of-Experts model with a 3-trillion-token budget. It targets the cost of softmax attention, which grows quadratically with sequence length, for ultra-long contexts in agentic workflows and code reasoning.
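To make the scaling argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not MiniMax's implementation: the function names, the per-query key budget of 2,048, and the head counts are all hypothetical and chosen only for illustration. It counts the query-key scores per head for dense attention against a scheme where each query sees a fixed number of keys, and shows the KV-cache ratio that GQA gives relative to standard multi-head attention.

```python
def dense_score_entries(seq_len: int) -> int:
    """Query-key scores computed per head by full (non-causal) softmax attention."""
    return seq_len * seq_len


def budgeted_score_entries(seq_len: int, keys_per_query: int) -> int:
    """Scores per head if each query attends to at most `keys_per_query` keys.

    `keys_per_query` is a hypothetical fixed budget, not a published MSA setting.
    """
    return seq_len * min(seq_len, keys_per_query)


def gqa_kv_cache_ratio(num_query_heads: int, num_kv_heads: int) -> float:
    """KV-cache size under GQA relative to standard multi-head attention."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly into KV groups")
    return num_kv_heads / num_query_heads


if __name__ == "__main__":
    # Context lengths and the 2,048-key budget are illustrative assumptions.
    for n in (8_192, 131_072, 1_048_576):
        dense = dense_score_entries(n)
        sparse = budgeted_score_entries(n, keys_per_query=2_048)
        print(f"n={n:>9,}  dense={dense:.3e}  budgeted={sparse:.3e}  "
              f"ratio={dense / sparse:,.0f}x")
    # Hypothetical head counts: 32 query heads sharing 8 KV heads.
    print("GQA KV-cache ratio:", gqa_kv_cache_ratio(32, 8))
```

Doubling the context length quadruples the dense count but only doubles the budgeted one, which is why the gap widens as contexts get longer.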
