xAI

Grok 3 mini Reasoning (high)

Unknown Size

By xAI • Released 2025-02-19

[Figure: Capability Radar]

Avg Score: 56 (across all benchmarks)

Benchmarks participated: 13

Benchmark Performance

Benchmark                               Category                Score
MATH-500                                Reasoning               99.2
𝜏²-Bench Telecom                        Reasoning Knowledge     90.4
AIME 2025                               Reasoning               84.7
MMLU-Pro                                Knowledge               82.8
GPQA Diamond                            Knowledge               79.1
LiveCodeBench                           Coding                  69.6
LCR                                     Long-Context Reasoning  50.3
IFBench                                 Agent                   45.9
SciCode                                 Reasoning Knowledge     40.6
Artificial Analysis Intelligence Index  Knowledge               32
Artificial Analysis Coding Index        Coding                  25.2
Terminal-Bench Hard                     Agent Coding            17.4
HLE                                     Knowledge Multi-Modal   11.1
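
The page does not say how the Avg Score is derived. As a rough check, the sketch below assumes it is an unweighted arithmetic mean of the 13 scores listed above. The `scores` dictionary and the variable names are illustrative only and are not part of any published API.

```python
# Scores copied from the benchmark listing above.
scores = {
    "MATH-500": 99.2,
    "𝜏²-Bench Telecom": 90.4,
    "AIME 2025": 84.7,
    "MMLU-Pro": 82.8,
    "GPQA Diamond": 79.1,
    "LiveCodeBench": 69.6,
    "LCR": 50.3,
    "IFBench": 45.9,
    "SciCode": 40.6,
    "Artificial Analysis Intelligence Index": 32.0,
    "Artificial Analysis Coding Index": 25.2,
    "Terminal-Bench Hard": 17.4,
    "HLE": 11.1,
}

# Assumption: every benchmark carries equal weight in the headline figure.
average = sum(scores.values()) / len(scores)

print(f"{len(scores)} benchmarks, mean score {average:.1f}")
```

Under that assumption the mean comes out at about 56.0, which agrees with the rounded Avg Score of 56 shown on the page. Note that the two Artificial Analysis indices are themselves composite scores, so a plain mean mixes composite and single-benchmark results.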