Grok 4

Model size: unknown

By xAI • Released 2025-07-10

Capability Radar

Avg Score: 64 (across all benchmarks)
Participated: 13 benchmarks

Benchmark Performance

| Benchmark | Category | Score |
| --- | --- | --- |
| MATH-500 | Reasoning | 99 |
| AIME 2025 | Reasoning | 92.7 |
| GPQA Diamond | Knowledge | 87.7 |
| MMLU-Pro | Knowledge | 86.6 |
| LiveCodeBench | Coding | 81.9 |
| 𝜏²-Bench Telecom | Reasoning, Knowledge | 74.9 |
| LCR | Long-Context Reasoning | 68 |
| IFBench | Agent | 53.7 |
| SciCode | Reasoning, Knowledge | 45.7 |
| Artificial Analysis Intelligence Index | Knowledge | 41.4 |
| Artificial Analysis Coding Index | Coding | 40.5 |
| Terminal-Bench Hard | Agent, Coding | 37.9 |
| HLE | Knowledge, Multi-Modal | 23.9 |
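
The Avg Score of 64 is consistent with a simple average of the 13 listed scores. The page does not say how the average is computed, so the sketch below assumes a plain unweighted mean; the `scores` dictionary is only an illustrative container for the values in the table.

```python
from statistics import mean

# Scores copied from the Benchmark Performance table, in listed order.
scores = {
    "MATH-500": 99.0,
    "AIME 2025": 92.7,
    "GPQA Diamond": 87.7,
    "MMLU-Pro": 86.6,
    "LiveCodeBench": 81.9,
    "tau2-Bench Telecom": 74.9,
    "LCR": 68.0,
    "IFBench": 53.7,
    "SciCode": 45.7,
    "Artificial Analysis Intelligence Index": 41.4,
    "Artificial Analysis Coding Index": 40.5,
    "Terminal-Bench Hard": 37.9,
    "HLE": 23.9,
}

# Assumption: the page's "Avg Score" is an unweighted mean over all benchmarks.
avg = mean(scores.values())

print(f"{len(scores)} benchmarks, mean score {avg:.1f}")
```

Under that assumption the mean comes to about 64.1, which rounds to the 64 shown on the page.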