DeepSeek V3.2 (Reasoning)

Size: Unknown

By DeepSeek • Released 2025-12-01

Avg Score: 62 (across all benchmarks)

Participated: 13 benchmarks

Benchmark Performance

Benchmark	Category	Score
AIME 2025	Reasoning	92
τ²-Bench Telecom	Reasoning Knowledge	90.6
LiveCodeBench	Coding	86.2
MMLU-Pro	Knowledge	86.2
GPQA Diamond	Knowledge	84
LCR	Long-Context Reasoning	65
IFBench	Agent	60.7
SWE-bench (Bash Only)	Coding Agent	60
Artificial Analysis Intelligence Index	Knowledge	41.6
SciCode	Reasoning Knowledge	38.9
Artificial Analysis Coding Index	Coding	36.7
Terminal-Bench Hard	Agent Coding	35.6
HLE	Knowledge Multi-Modal	22.2
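
The page does not say how the Avg Score is derived. A minimal sketch, assuming it is the unweighted arithmetic mean of the 13 listed scores rounded to the nearest integer, reproduces the figure. The `scores` dictionary and the `average_score` helper are illustrative names, not part of the source page.

```python
from statistics import mean

# Scores copied from the Benchmark Performance table above.
scores = {
    "AIME 2025": 92,
    "τ²-Bench Telecom": 90.6,
    "LiveCodeBench": 86.2,
    "MMLU-Pro": 86.2,
    "GPQA Diamond": 84,
    "LCR": 65,
    "IFBench": 60.7,
    "SWE-bench (Bash Only)": 60,
    "Artificial Analysis Intelligence Index": 41.6,
    "SciCode": 38.9,
    "Artificial Analysis Coding Index": 36.7,
    "Terminal-Bench Hard": 35.6,
    "HLE": 22.2,
}


def average_score(table):
    """Unweighted mean of all benchmark scores (assumed aggregation rule)."""
    return mean(table.values())


avg = average_score(scores)
print(f"{len(scores)} benchmarks, mean {avg:.2f}, rounded {round(avg)}")
```

Under this assumption the mean comes to about 61.5, which rounds to the listed 62. The two Artificial Analysis indices are themselves composite scores, so a weighted or per-category average would give a different number.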