Analysis · AI Models · July 3, 2026

Hobbyist builds 448GB VRAM rig for MiniMax M3 AWQ-INT4

A Reddit user assembled a 12-GPU rig from 2x RTX Pro 6000 Max-Q, 8x RTX 3090, and 2x RTX 5090 cards, totaling 448 GB of VRAM. The rig runs MiniMax M3, quantized to AWQ-INT4, on vLLM with tensor parallelism and reaches about 30 tokens per second.
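The 448 GB figure can be checked with a quick sum. This is a minimal sketch: the card counts come from the summary, while the per-card capacities (96 GB, 24 GB, 32 GB) are the published specs for those models and are not stated in the source post. The names `VRAM_GB`, `CARD_COUNTS`, and `total_vram_gb` are made up for illustration.

```python
# Per-card VRAM in GB. Assumption: published specs, not given in the source post.
VRAM_GB = {
    "RTX Pro 6000 Max-Q": 96,
    "RTX 3090": 24,
    "RTX 5090": 32,
}

# Card counts as reported in the summary.
CARD_COUNTS = {
    "RTX Pro 6000 Max-Q": 2,
    "RTX 3090": 8,
    "RTX 5090": 2,
}


def total_vram_gb(counts, vram_gb):
    """Sum VRAM across all cards: count * per-card capacity."""
    return sum(n * vram_gb[name] for name, n in counts.items())


total = total_vram_gb(CARD_COUNTS, VRAM_GB)
print(f"{sum(CARD_COUNTS.values())} GPUs, {total} GB VRAM")
# prints "12 GPUs, 448 GB VRAM"
```

The two workstation cards and the eight RTX 3090s each contribute 192 GB, with the two RTX 5090s adding the remaining 64 GB.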

1 source

Uh.. Honey, how do you feel about takeout? (reddit.com)
