How-To · Developers
Jun 12, 12:01 AM
Test llama.cpp threads argument for up to 80% performance boost
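A user reports up to 80% faster inference after tuning the --threads argument in llama.cpp on a hybrid CPU (one that mixes performance and efficiency cores). The tip: run inference on the P-cores only, pinning the process to them with taskset or another CPU-affinity tool, and test different thread counts rather than relying on the default.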
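As a rough illustration of the tip, the sketch below builds one taskset-pinned llama.cpp command per candidate thread count so each can be timed in turn. It is a minimal sketch under stated assumptions, not the reporting user's exact method: the binary path `./llama-cli`, the model file `model.gguf`, and the P-core CPU IDs `0,2,4,6` are all placeholders, and the real P-core IDs depend on the machine (check with `lscpu` or similar before pinning).

```python
import shlex


def build_command(p_core_cpus, threads, binary="./llama-cli", model="model.gguf"):
    """Return an argv list that pins llama.cpp to the given CPUs via taskset.

    binary and model are placeholder paths; p_core_cpus must be supplied
    by the caller, since the P-core numbering differs between machines.
    """
    cpu_list = ",".join(str(cpu) for cpu in p_core_cpus)
    return [
        "taskset", "-c", cpu_list,   # restrict the process to these logical CPUs
        binary, "-m", model,
        "--threads", str(threads),   # the argument the tip says to tune
    ]


def sweep(p_core_cpus, binary="./llama-cli", model="model.gguf"):
    """Build one command per thread count, from 1 up to the number of pinned CPUs."""
    return [
        build_command(p_core_cpus, threads, binary, model)
        for threads in range(1, len(p_core_cpus) + 1)
    ]


if __name__ == "__main__":
    # Assumed layout for illustration only: logical CPUs 0, 2, 4, 6 are P-cores.
    for argv in sweep([0, 2, 4, 6]):
        print(shlex.join(argv))
```

The script only prints the commands; running each one and comparing the reported tokens per second shows which thread count works best on a given machine.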