Mistral

Devstral Small (May '25)

Size: Unknown

By Mistral • Released 2025-05-21

[Figure: Capability Radar]

Avg Score: 30 (across all benchmarks)

Benchmarks participated: 12

Benchmark Performance

Benchmark                                Category                  Score
MATH-500                                 Reasoning                 68.4
MMLU-Pro                                 Knowledge                 63.2
GPQA Diamond                             Knowledge                 43.4
𝜏²-Bench Telecom                         Reasoning, Knowledge      38
IFBench                                  Agent                     31.6
LCR                                      Long-Context Reasoning    26.7
LiveCodeBench                            Coding                    25.8
SciCode                                  Reasoning, Knowledge      24.5
Artificial Analysis Intelligence Index   Knowledge                 18
Artificial Analysis Coding Index         Coding                    12.2
Terminal-Bench Hard                      Agent, Coding             6.1
HLE                                      Knowledge, Multi-Modal    4
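
As a rough sanity check on the Avg Score, the sketch below recomputes it from the 12 scores listed under Benchmark Performance. It assumes the Avg Score is a plain unweighted mean of those scores; the page does not say how the average is calculated, so the formula here is an assumption, and the variable names are illustrative only.

```python
# Scores copied from the Benchmark Performance listing above.
scores = {
    "MATH-500": 68.4,
    "MMLU-Pro": 63.2,
    "GPQA Diamond": 43.4,
    "τ²-Bench Telecom": 38,
    "IFBench": 31.6,
    "LCR": 26.7,
    "LiveCodeBench": 25.8,
    "SciCode": 24.5,
    "Artificial Analysis Intelligence Index": 18,
    "Artificial Analysis Coding Index": 12.2,
    "Terminal-Bench Hard": 6.1,
    "HLE": 4,
}

# Assumption: the reported Avg Score is an unweighted arithmetic mean.
mean_score = sum(scores.values()) / len(scores)

print(f"{len(scores)} benchmarks, mean {mean_score:.1f}")
```

Under that assumption the mean comes out at about 30.2, which agrees with the reported Avg Score of 30 after rounding to a whole number.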