
llama.cpp PR optimizes Top-N-Sigma sampler by removing softmax+sort

TimNN's PR #22645 removes an unnecessary softmax and sort from the Top-N-Sigma sampler, improving throughput in tokens per second (t/s) on an M3 Max MacBook Pro. The change avoids wasted computation when Top-N-Sigma is followed by Dist sampling.
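To illustrate why the filter itself needs neither step, here is a minimal Python sketch of the general Top-N-Sigma idea: keep every token whose logit lies within n standard deviations of the maximum logit. This is not the code from the PR. The function names `top_n_sigma_filter` and `dist_sample` are hypothetical, and the sketch assumes a non-empty list of logits and a final distribution sampler that does its own normalization.

```python
import math
import random


def top_n_sigma_filter(logits, n=1.0):
    """Keep (token_id, logit) pairs with logit >= max - n * std.

    Only the max, mean and standard deviation of the raw logits are
    needed, so the filter requires no softmax and no sort.
    """
    max_logit = max(logits)
    mean = sum(logits) / len(logits)
    std = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    threshold = max_logit - n * std
    return [(i, x) for i, x in enumerate(logits) if x >= threshold]


def dist_sample(candidates, rng):
    """Illustrative final sampler: softmax over survivors, then one draw."""
    max_logit = max(x for _, x in candidates)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(x - max_logit) for _, x in candidates]
    ids = [i for i, _ in candidates]
    return rng.choices(ids, weights=weights, k=1)[0]


logits = [5.0, 4.8, 1.0, 0.5, -2.0]
survivors = top_n_sigma_filter(logits, n=1.0)
token = dist_sample(survivors, random.Random(0))
```

In this sketch the only normalization happens once, inside the final sampler and over the surviving tokens, so a softmax or sort inside the filter would be redundant work.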

14 hours ago