Ranking

Top ranked models

Models are ranked by their best full-benchmark run when one is available. If a model has only scoped runs, its best scoped run is shown instead. Unique category #1s are counted across all of a model's tracked runs; a category counts only when no other model matches that high score, so tied category winners are excluded.
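The ranking rule described above can be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual implementation: the `Run` type, its field names, the function names, and the sample models and scores are all hypothetical, and the handling of equal overall scores (kept in input order here) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    # Hypothetical record shape; the real schema is not given on the page.
    model: str
    score: float   # overall score in percent
    full: bool     # True for a full-benchmark run, False for a scoped run
    categories: dict[str, float] = field(default_factory=dict)


def representative_run(runs: list[Run]) -> Run:
    """Best full run if any exist, otherwise the best scoped run."""
    full_runs = [r for r in runs if r.full]
    return max(full_runs or runs, key=lambda r: r.score)


def rank_models(runs: list[Run]) -> list[Run]:
    """One representative run per model, sorted by score, highest first."""
    by_model: dict[str, list[Run]] = {}
    for r in runs:
        by_model.setdefault(r.model, []).append(r)
    reps = [representative_run(rs) for rs in by_model.values()]
    # sorted() is stable, so equal scores keep input order (an assumption).
    return sorted(reps, key=lambda r: r.score, reverse=True)


def unique_category_firsts(runs: list[Run]) -> dict[str, list[str]]:
    """Categories where exactly one model holds the top score."""
    # Best score per category and model, across all tracked runs.
    best: dict[str, dict[str, float]] = {}
    for r in runs:
        for cat, s in r.categories.items():
            per_model = best.setdefault(cat, {})
            per_model[r.model] = max(s, per_model.get(r.model, s))
    firsts: dict[str, list[str]] = {r.model: [] for r in runs}
    for cat, per_model in best.items():
        top = max(per_model.values())
        leaders = [m for m, s in per_model.items() if s == top]
        if len(leaders) == 1:  # tied highs are not credited to anyone
            firsts[leaders[0]].append(cat)
    return firsts


# Made-up sample data for illustration only.
sample = [
    Run("model-a", 80.0, True, {"logic": 90.0, "tone": 70.0}),
    Run("model-a", 95.0, False, {"logic": 95.0}),
    Run("model-b", 85.0, True, {"logic": 95.0, "tone": 75.0}),
    Run("model-c", 60.0, False, {"tone": 60.0}),
]
ranking = [(r.model, r.score) for r in rank_models(sample)]
firsts = unique_category_firsts(sample)
```

In this sample, model-a's scoped 95.0 run is ignored for ranking because a full run exists, while model-c is ranked on its scoped run because it has no other. The "logic" category is tied between model-a and model-b, so neither is credited with it.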

Rank#1
xAI
Grok 4.20 Multi-Agent
x-ai/grok-4.20-multi-agent
Score91.0%A
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-15
Cost$66.6636
Runs tracked1 tracked · 1 full
Unique category #1s
1 category record
1
Across this model's tracked runs, no other model matches these category highs.
Ambiguous Interpretation
Rank#2
xAI
Grok 4.3
x-ai/grok-4.3
Score88.9%B+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-06-17
Cost$1.9540
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#3
Google
Gemini 3 Flash Preview
google/gemini-3-flash-preview
Score85.3%B+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$1.7638
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#4
xAI
Grok 4.1 Fast
x-ai/grok-4.1-fast
Score84.8%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$0.7993
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#5
Z.ai
GLM 5.1
z-ai/glm-5.1
Score84.1%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-07
Cost$4.2011
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#6
Anthropic
Claude Opus 4.7
anthropic/claude-opus-4.7
Score84.0%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-16
Cost$14.4565
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#7
Google
Gemini 3.1 Flash Lite Preview
google/gemini-3.1-flash-lite-preview
Score83.8%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-03
Cost$1.0205
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#8
Google
Gemini 3.1 Pro Preview
google/gemini-3.1-pro-preview
Score83.3%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-19
Cost$16.2001
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#9
Google
Gemma 4 31B
google/gemma-4-31b-it
Score82.2%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-02
Cost$0.3346
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#10
MoonshotAI
Kimi K2.5
moonshotai/kimi-k2.5
Score82.0%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$2.9077
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#11
Google
Gemini 3.5 Flash
google/gemini-3.5-flash
Score81.9%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-05-20
Cost$10.7417
Runs tracked1 tracked · 1 full
Unique category #1s
2 category records
2
Across this model's tracked runs, no other model matches these category highs.
Overfit, Adversarial (Hostile Logic)
Rank#12
Xiaomi
MiMo-V2-Pro
xiaomi/mimo-v2-pro
Score81.4%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-15
Cost$0.4332
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#13
Anthropic
Claude Opus 4.5
anthropic/claude-opus-4.5
Score81.4%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$6.8509
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#14
DeepSeek
DeepSeek V4 Pro
deepseek/deepseek-v4-pro
Score81.4%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-24
Cost$3.7081
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#15
Z.ai
GLM 5.2
z-ai/glm-5.2
Score81.3%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-06-17
Cost$4.7322
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#16
Anthropic
Claude Opus 4.6
anthropic/claude-opus-4.6
Score80.7%B
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-05
Cost$13.6604
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#17
Anthropic
Claude Sonnet 4.5
anthropic/claude-sonnet-4.5
Score79.8%C+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$4.5501
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#18
Google
Gemini 3 Pro Preview
google/gemini-3-pro-preview
Score79.3%C+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-02
Cost$15.5395
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#19
Xiaomi
MiMo-V2-Omni
xiaomi/mimo-v2-omni
Score79.2%C+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-15
Cost$0.4379
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#20
MoonshotAI
Kimi K2.6
moonshotai/kimi-k2.6
Score78.3%C+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-21
Cost$3.6900
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#21
OpenAI
GPT-4.1
openai/gpt-4.1
Score77.3%C+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-01
Cost$2.4096
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#22
Z.ai
GLM 5
z-ai/glm-5
Score76.8%C+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-11
Cost$4.4341
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#23
Qwen
Qwen3.7 Max
qwen/qwen3.7-max
Score76.8%C+
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-05-29
Cost$6.3722
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#24
Qwen
Qwen3.6 Plus Preview (free)
qwen/qwen3.6-plus-preview:free
Score74.4%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-02
Cost$0.4492
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#25
Z.ai
GLM 4.7
z-ai/glm-4.7
Score74.3%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$3.0510
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#26
DeepSeek
DeepSeek V4 Flash
deepseek/deepseek-v4-flash
Score73.2%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-24
Cost$0.6274
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#27
DeepSeek
DeepSeek V3.2
deepseek/deepseek-v3.2
Score73.0%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$0.6649
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#28
Anthropic
Claude Opus 4.8
anthropic/claude-opus-4.8
Score72.9%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-05-29
Cost$10.2720
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#29
Anthropic
Claude Sonnet 4.6
anthropic/claude-sonnet-4.6
Score72.7%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-17
Cost$5.3352
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#30
Xiaomi
MiMo-V2.5
xiaomi/mimo-v2.5
Score72.6%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-27
Cost$1.4285
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#31
MiniMax
MiniMax M3
minimax/minimax-m3
Score72.3%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-06-01
Cost$1.1826
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#32
DeepSeek
R1 0528
deepseek/deepseek-r1-0528
Score72.0%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-04
Cost$2.7262
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#33
Mistral
Mistral Small 4
mistralai/mistral-small-2603
Score72.0%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-28
Cost$0.7188
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#34
Arcee AI
Trinity Large Preview (free)
arcee-ai/trinity-large-preview:free
Score70.6%C
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$0.3629
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#35
OpenAI
GPT Chat Latest
openai/gpt-chat-latest
Score69.9%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-05-06
Cost$9.7443
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#36
Xiaomi
MiMo-V2.5-Pro
xiaomi/mimo-v2.5-pro
Score68.7%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-27
Cost$2.0532
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#37
xAI
Grok 4.20
x-ai/grok-4.20
Score67.8%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-15
Cost$2.4162
Runs tracked1 tracked · 1 full
Unique category #1s
1 category record
1
Across this model's tracked runs, no other model matches these category highs.
EQ Boundaries
Rank#38
Anthropic
Claude Sonnet 5
anthropic/claude-sonnet-5
Score67.5%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-06-30
Cost$4.4663
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#39
OpenAI
GPT-4o (extended)
openai/gpt-4o:extended
Score66.6%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$4.9335
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#40
Arcee AI
Trinity Large Thinking
arcee-ai/trinity-large-thinking
Score66.6%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-02
Cost$1.1208
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#41
MiniMax
MiniMax M2.1
minimax/minimax-m2.1
Score66.4%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$1.1797
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#42
OpenAI
GPT-5.5
openai/gpt-5.5
Score65.4%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-24
Cost$14.9956
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#43
Anthropic
Claude Haiku 4.5
anthropic/claude-haiku-4.5
Score64.8%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$1.4211
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#44
Anthropic
Claude Fable 5
anthropic/claude-fable-5
Score64.6%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-06-09
Cost$27.2287
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#45
OpenAI
GPT-5.3 Chat
openai/gpt-5.3-chat
Score63.4%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-03
Cost$4.1752
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#46
OpenAI
GPT-4o-mini
openai/gpt-4o-mini
Score61.8%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-01
Cost$0.5171
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#47
Qwen
Qwen-Max
qwen/qwen-max
Score60.8%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-02
Cost$2.0263
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#48
NVIDIA
Nemotron 3 Nano 30B A3B
nvidia/nemotron-3-nano-30b-a3b
Score60.6%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-07
Cost$0.7716
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#49
Qwen
Qwen3.5 397B A17B
qwen/qwen3.5-397b-a17b
Score60.5%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-17
Cost$6.4897
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#50
MiniMax
MiniMax M2.7
minimax/minimax-m2.7
Score60.4%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-13
Cost$1.3814
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#51
Elephant
openrouter/elephant-alpha
Score60.3%D
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-13
Cost$0.3837
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#52
OpenAI
GPT-5.1
openai/gpt-5.1
Score58.9%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$6.9468
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#53
MiniMax
MiniMax M2.5
minimax/minimax-m2.5
Score57.0%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-12
Cost$1.1264
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#54
Qwen
Qwen3.5 Plus 2026-02-15
qwen/qwen3.5-plus-02-15
Score56.5%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-16
Cost$0.6773
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#55
NVIDIA
Nemotron 3 Super
nvidia/nemotron-3-super-120b-a12b
Score56.1%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-07
Cost$0.8746
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#56
OpenAI
GPT-5 Mini
openai/gpt-5-mini
Score54.8%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-01
Cost$2.4435
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#57
OpenAI
GPT-5.3-Codex
openai/gpt-5.3-codex
Score53.8%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-26
Cost$4.2936
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#58
Z.ai
GLM 5 Turbo
z-ai/glm-5-turbo
Score51.9%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-29
Cost$2.8015
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#59
OpenAI
GPT-5.4
openai/gpt-5.4
Score51.4%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-05
Cost$6.1319
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#60
OpenRouter Stealth
Aurora Alpha
openrouter/aurora-alpha
Score49.3%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-10
Cost$0.0445
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#61
OpenAI
GPT-5.2
openai/gpt-5.2
Score47.8%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$7.6194
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#62
Xiaomi
MiMo-V2-Flash
xiaomi/mimo-v2-flash
Score47.3%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-02-01
Cost$0.4337
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#63
OpenAI
GPT Mini Latest
~openai/gpt-mini-latest
Score44.3%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-04-27
Cost$1.4206
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#64
OpenAI
GPT-5.4 Mini
openai/gpt-5.4-mini
Score43.9%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-29
Cost$1.4024
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
Rank#65
OpenAI
GPT-5.4 Nano
openai/gpt-5.4-nano
Score38.0%F
BasisBest full benchmark run
ScopeAll categories
Benchmarkv1.0.0
Completed2026-03-29
Cost$0.8900
Runs tracked1 tracked · 1 full
Unique category #1s
No untied category records yet
0
This model does not currently hold a solo high score in any category.
Tied highs are excluded from this callout.
