5 days ago
OpenAI, Google, Anthropic pivot to inference-time compute
Inference-time compute lets a model spend more computation on each query, typically by reasoning at greater length before it answers, instead of relying only on a larger base model. OpenAI released o1 and o3, Google shipped Gemini 2.0 Flash Thinking, and Anthropic added extended thinking to Claude 3.7 Sonnet. Cerebras explains the disaggregated inference approach in a new video.
