Analysis · AI Models
26 days ago
Grok 4.3 tops Consistency Leaderboard in LLM Sycophancy Benchmark
Grok 4.3 ranks first on the Consistency Leaderboard of the LLM Sycophancy Benchmark, which measures how often a model changes its judgment in response to user input. The result is attributed to Grok 4.3's cautious behavior.
