Mistral

Devstral 2

Unknown Size

By Mistral • Released 2025-12-09

[Figure: Capability Radar]

Avg Score: 36 (across all benchmarks)

Participated: 13 benchmarks

Benchmark Performance

Benchmark                               Category                Score
MMLU-Pro                                Knowledge               76.2
GPQA Diamond                            Knowledge               59.4
SWE-bench (Bash Only)                   Coding Agent            53.8
LiveCodeBench                           Coding                  44.8
IFBench                                 Agent                   38.1
AIME 2025                               Reasoning               36.7
SciCode                                 Reasoning Knowledge     33.1
LCR                                     Long-Context Reasoning  30
𝜏²-Bench Telecom                        Reasoning Knowledge     24.9
Artificial Analysis Coding Index        Coding                  23.7
Artificial Analysis Intelligence Index  Knowledge               22
Terminal-Bench Hard                     Agent Coding            18.9
HLE                                     Knowledge Multi-Modal   3.6
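
As a quick sanity check on the summary figures, the sketch below recomputes the average from the 13 scores listed above. It assumes the page's "Avg Score" is a plain unweighted arithmetic mean rounded to the nearest integer; the page does not state its aggregation method, so that is a guess. Under that assumption the mean comes to roughly 35.8, which is consistent with the displayed 36. The function name `average_score` is made up for illustration.

```python
from statistics import mean

# Scores copied from the benchmark list above (13 entries).
scores = {
    "MMLU-Pro": 76.2,
    "GPQA Diamond": 59.4,
    "SWE-bench (Bash Only)": 53.8,
    "LiveCodeBench": 44.8,
    "IFBench": 38.1,
    "AIME 2025": 36.7,
    "SciCode": 33.1,
    "LCR": 30,
    "𝜏²-Bench Telecom": 24.9,
    "Artificial Analysis Coding Index": 23.7,
    "Artificial Analysis Intelligence Index": 22,
    "Terminal-Bench Hard": 18.9,
    "HLE": 3.6,
}


def average_score(results):
    """Unweighted mean of all scores.

    Assumption: the page aggregates this way; it does not say so.
    """
    return mean(results.values())


avg = average_score(scores)
print(f"benchmarks: {len(scores)}")
print(f"mean score: {avg:.1f} (rounded: {round(avg)})")
```

Note that the two Artificial Analysis indices are themselves composite scores, so an unweighted mean counts some underlying benchmarks more than once. That is one reason to treat the single average as a rough summary and not a ranking metric.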