Analysis | AI Models
17 days ago
Enthusiast runs 1-trillion-parameter LLM on single GPU with 768GB Optane memory
A user reached roughly 4 tokens per second running Kimi K2.5 locally on a single GPU backed by 768GB of inexpensive Intel Optane DIMMs. The setup shows that expanding system memory can be a cost-effective way to run very large models.
