How-To · Developers

llama.cpp tip: keep the model on the GPU with --n-gpu-layers 99, --no-mmap and --mlock

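Recent llama.cpp updates reportedly fix memory leaks. Users recommend running with the flags --n-gpu-layers 99 --no-mmap --mlock to keep the whole model on the GPU and reduce reliance on system RAM: --n-gpu-layers 99 offloads all model layers to the GPU, --no-mmap loads the model without memory-mapping the file, and --mlock asks the OS to keep the model's memory resident instead of swapping it out. Users report that this works well for eGPU setups, such as an external RTX 3090.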
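As a rough illustration of how these flags fit together, the sketch below assembles the command line in Python and prints it. The llama-cli binary name, the helper build_llama_command, and the path models/model.gguf are assumptions made for this example, not details from the original tip; adjust them to match your build and model file.

```python
import shlex


def build_llama_command(model_path, binary="llama-cli", gpu_layers=99):
    """Return the argv list for a llama.cpp run using the flags from the tip.

    `binary` and `model_path` are placeholders (assumptions): point them at
    your own llama.cpp executable and GGUF model file.
    """
    return [
        binary,
        "--model", model_path,
        "--n-gpu-layers", str(gpu_layers),  # offload (up to) this many layers to the GPU
        "--no-mmap",                        # load the model without memory-mapping the file
        "--mlock",                          # ask the OS to keep model memory resident
    ]


if __name__ == "__main__":
    # Hypothetical model path, used only for illustration.
    cmd = build_llama_command("models/model.gguf")
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(cmd))
```

Passing the printed list to subprocess.run would launch the process. On Linux, --mlock may also need a higher locked-memory limit (for example via ulimit -l), depending on how the system is configured.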

Jun 17, 6:23 PM