AIBriefs
Analysis | AI Models

ParallelKernelBench: frontier LLMs struggle with fast multi-GPU CUDA kernels

The benchmark spans 87 real-world workloads, and the best model solves fewer than a third of them. A few generated kernels, however, outperform any existing public implementation.

21 hours ago