Koda Intelligence
Leaderboard
LIVE · 03 BOARDS

The AI race, ranked.

3 leaderboards. 54 contenders. Updated weekly from the original sources. Who's winning right now.

// 01 / CHAMPIONS

This Week's Champions

General · Chatbot Arena
claude-fable-5 · Anthropic · 1510

Reasoning · LiveBench
GPT-5.5 Thinking xHigh Effort · OpenAI · 80.7

Coding · SWE-Bench Verified
Claude 4.5 Opus (high reasoning) · Anthropic · 76.8

Benchmarks
3
Rankings
54
Providers
12
Last Scanned
Jun 14, 2026
// 02 / BOARDS

All Leaderboards

Chatbot Arena · General

Human preference rankings from blind A/B comparisons (5.7M+ votes)

Rank · Model · Provider · Score
1 · claude-fable-5 · Anthropic · 1510
2 · claude-opus-4-6-thinking · Anthropic · 1504
3 · claude-opus-4-7-thinking · Anthropic · 1502
4 · claude-opus-4-6 · Anthropic · 1498
5 · claude-opus-4-7 · Anthropic · 1492
6 · muse-spark · Meta · 1487
7 · gemini-3.1-pro-preview · Google · 1487
8 · gemini-3-pro · Google · 1486
9 · claude-opus-4-8-thinking · Anthropic · 1486
10 · gpt-5.5-high · OpenAI · 1481
11 · gpt-5.4-high · OpenAI · 1479
12 · claude-opus-4-8 · Anthropic · 1477
13 · gemini-3.5-flash · Google · 1477
14 · gpt-5.2-chat-latest-20260210 · OpenAI · 1475
15 · glm-5.1 · Z.ai · 1475
16 · qwen3.7-max-preview · Alibaba · 1474
17 · grok-4.20-beta1 · xAI · 1474
18 · gpt-5.5 · OpenAI · 1474
19 · grok-4.20-beta-0309-reasoning · xAI · 1474
20 · gemini-3-flash · Google · 1473

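The gaps between Arena scores are small, so it can help to translate them into head-to-head terms. The sketch below assumes the scores sit on the conventional Elo scale (base 10, 400-point divisor). The page does not state this, so treat both the scale and the helper name `expected_win_rate` as illustrative assumptions rather than the board's own method.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of blind comparisons won by A over B.

    Assumes a standard Elo-style logistic curve; the scale of 400 is an
    assumption, not something the leaderboard page specifies.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Rank 1 vs rank 2 on the board above (1510 vs 1504).
top_vs_second = expected_win_rate(1510, 1504)

# Rank 1 vs rank 20 on the board above (1510 vs 1473).
top_vs_twentieth = expected_win_rate(1510, 1473)

print(f"1510 vs 1504: {top_vs_second:.3f}")
print(f"1510 vs 1473: {top_vs_twentieth:.3f}")
```

Under that assumption, a 6-point lead is close to a coin flip, and even the full spread of the top 20 corresponds to only a modest edge.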
LiveBench · Reasoning

Contamination-free benchmark with monthly-refreshed questions

Rank · Model · Provider · Score
1 · GPT-5.5 Thinking xHigh Effort · OpenAI · 80.7
2 · GPT-5.4 Thinking xHigh Effort · OpenAI · 80.3
3 · Gemini 3.1 Pro Preview High* · Google · 79.9
4 · Claude Fable 5 Thinking xHigh Effort · Anthropic · 78.3
5 · Claude 4.8 Opus Thinking xHigh Effort · Anthropic · 77.2
6 · Claude 4.7 Opus Thinking xHigh Effort · Anthropic · 76.9
7 · Claude 4.6 Opus Thinking High Effort · Anthropic · 76.3
8 · Claude 4.5 Opus Thinking High Effort · Anthropic · 76.0
9 · Claude 4.6 Sonnet Thinking Medium Effort · Anthropic · 75.5
10 · Gemini 3.5 Flash High · Google · 75.0
11 · GPT-5.2 High · OpenAI · 74.8
12 · GPT-5.2 Codex · OpenAI · 74.3
13 · Qwen 3.7 Max · Alibaba · 74.3
14 · GPT-5.1 Codex Max High · OpenAI · 74.0
15 · DeepSeek V4 Pro · DeepSeek · 73.6
16 · Gemini 3 Pro Preview High · Google · 73.4
17 · GPT-5.3 Codex High · OpenAI · 72.8
18 · Gemini 3 Flash Preview High · Google · 72.4
19 · Kimi K2.6 Thinking · Moonshot AI · 72.2
20 · GPT-5.1 High · OpenAI · 72.0

SWE-Bench Verified · Coding

Real-world software engineering task completion

Rank · Model · Provider · Score
1 · Claude 4.5 Opus (high reasoning) · Anthropic · 76.8
2 · Gemini 3 Flash (high reasoning) · Google · 75.8
3 · MiniMax M2.5 (high reasoning) · Minimax · 75.8
4 · Claude Opus 4.6 · Anthropic · 75.6
5 · GPT-5-2 Codex · OpenAI · 72.8
6 · GLM-5 (high reasoning) · ZAI · 72.8
7 · GPT-5-2 (high reasoning) · OpenAI · 72.8
8 · GPT 5.2 Codex · OpenAI · 72.8
9 · Claude 4.5 Sonnet (high reasoning) · Anthropic · 71.4
10 · Kimi K2.5 (high reasoning) · Kimi · 70.8
11 · DeepSeek V3.2 (high reasoning) · DeepSeek · 70
12 · Gemini 3 Pro · Google · 69.6
13 · Claude 4.5 Haiku (high reasoning) · Anthropic · 66.6
14 · GPT-5 Mini · OpenAI · 56.2

// 03 / POWER INDEX

Provider Power Index

Each provider's score is the sum of the placement points its models earn across every tracked leaderboard. Rank 1 of a 20-model board scores 20; rank 20 scores 1. A higher total means broader dominance.

Provider · Points
Anthropic · 231
OpenAI · 127
Google · 89
Meta · 15
Alibaba · 13
Minimax · 12
DeepSeek · 10
ZAI · 9
Z.ai · 6
xAI · 6
Kimi · 5
Moonshot AI · 2
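
The scoring rule can be sketched in a few lines. This is a minimal illustration, not the site's actual code: it assumes placement points equal board size minus rank plus one, which matches the stated 20-model example and extends it to the 14-model SWE-Bench Verified board (an inference from the totals, not something the page states). The function name `power_index` and the `BOARDS` structure are hypothetical. Provider labels are taken exactly as published, so "ZAI" and "Z.ai", and "Kimi" and "Moonshot AI", are counted separately.

```python
from collections import defaultdict

# Provider label for each rank, rank 1 first, copied from the three boards.
BOARDS = {
    "Chatbot Arena": [
        "Anthropic", "Anthropic", "Anthropic", "Anthropic", "Anthropic",
        "Meta", "Google", "Google", "Anthropic", "OpenAI",
        "OpenAI", "Anthropic", "Google", "OpenAI", "Z.ai",
        "Alibaba", "xAI", "OpenAI", "xAI", "Google",
    ],
    "LiveBench": [
        "OpenAI", "OpenAI", "Google", "Anthropic", "Anthropic",
        "Anthropic", "Anthropic", "Anthropic", "Anthropic", "Google",
        "OpenAI", "OpenAI", "Alibaba", "OpenAI", "DeepSeek",
        "Google", "OpenAI", "Google", "Moonshot AI", "OpenAI",
    ],
    "SWE-Bench Verified": [
        "Anthropic", "Google", "Minimax", "Anthropic", "OpenAI",
        "ZAI", "OpenAI", "OpenAI", "Anthropic", "Kimi",
        "DeepSeek", "Google", "Anthropic", "OpenAI",
    ],
}


def power_index(boards: dict[str, list[str]]) -> dict[str, int]:
    """Sum placement points per provider label across all boards.

    Assumed rule: points = board_size - rank + 1, so the top rank earns
    as many points as the board has entries and the last rank earns 1.
    """
    totals: dict[str, int] = defaultdict(int)
    for ranked_providers in boards.values():
        size = len(ranked_providers)
        for rank, provider in enumerate(ranked_providers, start=1):
            totals[provider] += size - rank + 1
    return dict(totals)


totals = power_index(BOARDS)

# Print providers from highest to lowest total.
for provider, points in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{provider} · {points}")
```

Run against the boards as listed, this assumed rule reproduces the published totals for all twelve provider labels, which supports the size-minus-rank reading for the shorter board.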

This page is refreshed weekly from each source. The sources themselves update on different cadences: Chatbot Arena hourly, LiveBench monthly, and SWE-Bench Verified whenever new submissions are accepted. Scores are shown as published, with no normalization across boards.
