Analysis · AI Models

DeepSeek V4 tops coding benchmarks but trails frontier by 8 months

DeepSeek V4 scores 80.6 on SWE-bench Verified and 93.5 on LiveCodeBench, among the best results on both coding benchmarks. Yet CAISI rates it roughly eight months behind frontier models across a broad set of domains.

3 days ago · AIBriefs