Opportunistic Expert Activation speeds MoE decoding without retraining

Analysis, AI Models

4 hours ago

Accelerate inference, model shaping, and pre-training on a research-optimized platform.

Together AI

@togethercompute

8/ Opportunistic Expert Activation: Batch-Aware Expert Routing for Faster Decode Without Retraining (OEA) https://t.co/dw33plIoxW
