DeepSeek

DeepSeek R1 0528 Qwen3 8B

Unknown Size

By DeepSeek • Released 2025-05-29

Avg Score: 33 (across all benchmarks)

Benchmarks participated: 13

Benchmark Performance

Benchmark                                Category                 Score
MATH-500                                 Reasoning                93.2
MMLU-Pro                                 Knowledge                73.9
AIME 2025                                Reasoning                63.7
GPQA Diamond                             Knowledge                61.2
LiveCodeBench                            Coding                   51.3
SciCode                                  Reasoning, Knowledge     20.4
IFBench                                  Agent                    19.9
Artificial Analysis Intelligence Index   Knowledge                16.4
LCR                                      Long-Context Reasoning   13
Artificial Analysis Coding Index         Coding                   7.8
HLE                                      Knowledge, Multi-Modal   5.6
Terminal-Bench Hard                      Agent, Coding            1.5
𝜏²-Bench Telecom                         Reasoning, Knowledge     0
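
The page does not say how Avg Score is computed. As a rough check, and assuming it is a plain unweighted mean of the 13 listed scores (an assumption, not something the page confirms), the reported value of 33 can be reproduced from the table. The variable names below are illustrative only.

```python
# Scores copied from the Benchmark Performance table.
scores = {
    "MATH-500": 93.2,
    "MMLU-Pro": 73.9,
    "AIME 2025": 63.7,
    "GPQA Diamond": 61.2,
    "LiveCodeBench": 51.3,
    "SciCode": 20.4,
    "IFBench": 19.9,
    "Artificial Analysis Intelligence Index": 16.4,
    "LCR": 13.0,
    "Artificial Analysis Coding Index": 7.8,
    "HLE": 5.6,
    "Terminal-Bench Hard": 1.5,
    "𝜏²-Bench Telecom": 0.0,
}

# Assumption: Avg Score is an unweighted arithmetic mean, rounded to an integer.
average = sum(scores.values()) / len(scores)

print(len(scores))     # 13 benchmarks
print(round(average))  # 33
```

Under that assumption the mean is about 32.9, which rounds to the 33 shown. The spread is wide: the top five scores are all above 50, while the remaining eight are all at or below 20.4, so the single average says little about any individual capability.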