DeepSeek V3.1 (Non-reasoning)

Unknown Size

By DeepSeek • Released 2025-08-21

[Figure: Capability Radar chart]

Avg Score: 42 (across all benchmarks)

Participated: 12 benchmarks

Benchmark Performance

Benchmark	Category	Score
MMLU-Pro	Knowledge	83.3
GPQA Diamond	Knowledge	73.5
LiveCodeBench	Coding	57.7
AIME 2025	Reasoning	49.7
LCR	Long-Context Reasoning	45
IFBench	Agent	37.8
SciCode	Reasoning, Knowledge	36.7
𝜏²-Bench Telecom	Reasoning, Knowledge	34.8
Artificial Analysis Coding Index	Coding	28.4
Artificial Analysis Intelligence Index	Knowledge	28
Terminal-Bench Hard	Agent, Coding	24.2
HLE	Knowledge, Multi-Modal	6.3
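
The reported Avg Score can be cross-checked against the table. The sketch below assumes the figure is a plain unweighted mean of the 12 listed scores; the page does not state how it is computed, so treat that as an assumption. The dictionary name `scores` is illustrative only.

```python
# Scores copied from the Benchmark Performance table above.
scores = {
    "MMLU-Pro": 83.3,
    "GPQA Diamond": 73.5,
    "LiveCodeBench": 57.7,
    "AIME 2025": 49.7,
    "LCR": 45.0,
    "IFBench": 37.8,
    "SciCode": 36.7,
    "𝜏²-Bench Telecom": 34.8,
    "Artificial Analysis Coding Index": 28.4,
    "Artificial Analysis Intelligence Index": 28.0,
    "Terminal-Bench Hard": 24.2,
    "HLE": 6.3,
}

# Assumption: the average is an unweighted arithmetic mean.
avg = sum(scores.values()) / len(scores)

print(len(scores), round(avg, 1))  # prints: 12 42.1
```

Under that assumption the mean is about 42.1, which matches the displayed 42 after rounding to a whole number.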