Devstral Medium

Size: unknown

By Mistral • Released 2025-07-10

Avg Score: 30 (across all benchmarks)

Participated: 13 benchmarks

Benchmark Performance

Benchmark                               Category                Score
MMLU-Pro                                Knowledge               70.8
MATH-500                                Reasoning               70.7
GPQA Diamond                            Knowledge               49.2
LiveCodeBench                           Coding                  33.7
IFBench                                 Agent                   29.9
SciCode                                 Reasoning, Knowledge    29.4
LCR                                     Long-Context Reasoning  28.7
𝜏²-Bench Telecom                        Reasoning, Knowledge    19.9
Artificial Analysis Intelligence Index  Knowledge               18.6
Artificial Analysis Coding Index        Coding                  15.9
Terminal-Bench Hard                     Agent, Coding           9.1
AIME 2025                               Reasoning               4.7
HLE                                     Knowledge, Multi-Modal  3.8
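
As a quick sanity check, the average score of 30 can be reproduced from the 13 listed scores. The sketch below assumes the page uses an unweighted arithmetic mean rounded to the nearest integer; the page does not state its method, so that is an assumption, and the `scores` dictionary is only an illustrative container for the values in the table.

```python
# Scores copied from the benchmark table above.
# Assumption: the page's "Avg Score" is an unweighted mean, rounded to an integer.
scores = {
    "MMLU-Pro": 70.8,
    "MATH-500": 70.7,
    "GPQA Diamond": 49.2,
    "LiveCodeBench": 33.7,
    "IFBench": 29.9,
    "SciCode": 29.4,
    "LCR": 28.7,
    "𝜏²-Bench Telecom": 19.9,
    "Artificial Analysis Intelligence Index": 18.6,
    "Artificial Analysis Coding Index": 15.9,
    "Terminal-Bench Hard": 9.1,
    "AIME 2025": 4.7,
    "HLE": 3.8,
}

mean = sum(scores.values()) / len(scores)
print(f"{len(scores)} benchmarks, mean {mean:.2f}, rounded {round(mean)}")
```

Under that assumption the mean comes to about 29.57, which rounds to the reported 30. Two of the rows are themselves composite indices, so the figure mixes individual benchmarks with aggregates.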