Analysis · Developers
8 days ago
Maybe KV cache offload to RAM isn't bad
Reddit user bobaburger shares their experience with llama.cpp's `-nkvo` (`--no-kv-offload`) flag, which keeps the KV cache in system RAM instead of VRAM, and argues the trade-off is acceptable. They run Qwen3.6 27B (IQ4_XS) and find the performance impact manageable on their hardware.