
DeepSeek V3 (Dec '24)

Unknown Size

By DeepSeek • Released 2024-12-26

Avg Score: 34 (across all benchmarks)
Benchmarks participated: 13

Benchmark Performance

Benchmark                                 Category                  Score
MATH-500                                  Reasoning                 88.7
MMLU-Pro                                  Knowledge                 75.2
GPQA Diamond                              Knowledge                 55.7
LiveCodeBench                             Coding                    35.9
SciCode                                   Reasoning, Knowledge      35.4
IFBench                                   Agent                     34.8
LCR                                       Long-Context Reasoning    29
AIME 2025                                 Reasoning                 26
𝜏²-Bench Telecom                          Reasoning, Knowledge      22.8
Artificial Analysis Coding Index          Coding                    16.4
Artificial Analysis Intelligence Index    Knowledge                 16.4
Terminal-Bench Hard                       Agent, Coding             6.8
HLE                                       Knowledge, Multi-Modal    3.6
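
The page does not say how the Avg Score is derived. As a rough sanity check, the sketch below assumes it is the unweighted mean of the 13 listed scores, rounded to the nearest integer; that assumption is ours, not something the page states. The dictionary simply transcribes the scores listed above.

```python
from statistics import mean

# Scores transcribed from the benchmark listing above.
scores = {
    "MATH-500": 88.7,
    "MMLU-Pro": 75.2,
    "GPQA Diamond": 55.7,
    "LiveCodeBench": 35.9,
    "SciCode": 35.4,
    "IFBench": 34.8,
    "LCR": 29.0,
    "AIME 2025": 26.0,
    "𝜏²-Bench Telecom": 22.8,
    "Artificial Analysis Coding Index": 16.4,
    "Artificial Analysis Intelligence Index": 16.4,
    "Terminal-Bench Hard": 6.8,
    "HLE": 3.6,
}

# Assumption: the headline figure is a plain, unweighted mean.
benchmark_count = len(scores)
average = mean(scores.values())

print(f"{benchmark_count} benchmarks, mean score {average:.2f}")
```

Under this assumption the mean rounds to the reported 34, and the count matches the 13 benchmarks shown. Note that the two Artificial Analysis indices are themselves composite scores, so a plain mean over all 13 rows weights some underlying results more than once.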