Devstral Small 2

Unknown Size

By Mistral • Released 2025-12-09

Capability Radar

Avg Score: 30 (across all benchmarks)

Participated: 12 benchmarks

Benchmark Performance

Benchmark                               Category                Score
MMLU-Pro                                Knowledge               67.8
GPQA Diamond                            Knowledge               53.2
LiveCodeBench                           Coding                  34.8
AIME 2025                               Reasoning               34.3
IFBench                                 Agent                   31.2
SciCode                                 Reasoning Knowledge     28.8
LCR                                     Long-Context Reasoning  24
𝜏²-Bench Telecom                        Reasoning Knowledge     23.4
Artificial Analysis Coding Index        Coding                  20.7
Artificial Analysis Intelligence Index  Knowledge               19.3
Terminal-Bench Hard                     Agent Coding            16.7
HLE                                     Knowledge Multi-Modal   3.4
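
The reported Avg Score of 30 can be checked against the twelve scores listed above. The sketch below assumes the figure is an unweighted arithmetic mean rounded to the nearest integer; the page does not state how the average is computed, so this is an illustration rather than the site's actual method. The names `scores` and `average_score` are hypothetical.

```python
from statistics import mean

# Scores copied from the benchmark table above (benchmark name -> score).
scores = {
    "MMLU-Pro": 67.8,
    "GPQA Diamond": 53.2,
    "LiveCodeBench": 34.8,
    "AIME 2025": 34.3,
    "IFBench": 31.2,
    "SciCode": 28.8,
    "LCR": 24.0,
    "𝜏²-Bench Telecom": 23.4,
    "Artificial Analysis Coding Index": 20.7,
    "Artificial Analysis Intelligence Index": 19.3,
    "Terminal-Bench Hard": 16.7,
    "HLE": 3.4,
}


def average_score(values):
    """Unweighted arithmetic mean (an assumed aggregation, not confirmed by the source)."""
    return mean(list(values))


avg = average_score(scores.values())
print(f"{len(scores)} benchmarks, mean = {avg:.1f}, rounded = {round(avg)}")
# prints "12 benchmarks, mean = 29.8, rounded = 30"
```

Under this assumption the mean of 29.8 rounds to the listed value of 30. A weighted or category-level average would give a different number.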