EXO Labs runs trillion-parameter GLM 5.1 across four Mac Studios at 20 tok/s

May 26, 5:00 PM

Alex Cheema demonstrates a trillion-parameter model running across four Mac Studios, a setup costing about $40K, at roughly 20 tokens per second. He suggests that both cost and performance still have room to improve by about 100x.
