Analysis · AI Models
4 hours ago
Opportunistic Expert Activation speeds MoE decoding without retraining

Together AI
@togethercompute
Accelerate inference, model shaping, and pre-training on a research-optimized platform.
San Francisco, CA · together.ai

8/ Opportunistic Expert Activation: Batch-Aware Expert Routing for Faster Decode Without Retraining (OEA) https://t.co/dw33plIoxW