DeepSeek V3.1 Terminus (Non-reasoning)

Size: Unknown

By DeepSeek • Released 2025-09-22

Avg Score: 43 (across all benchmarks)

Benchmarks participated: 12

Benchmark Performance

Benchmark                                 Category                  Score
MMLU-Pro                                  Knowledge                 83.6
GPQA Diamond                              Knowledge                 75.1
AIME 2025                                 Reasoning                 53.7
LiveCodeBench                             Coding                    52.9
LCR                                       Long-Context Reasoning    43.3
IFBench                                   Agent                     41.2
𝜏²-Bench Telecom                          Reasoning, Knowledge      37.1
SciCode                                   Reasoning, Knowledge      32.1
Artificial Analysis Coding Index          Coding                    31.9
Terminal-Bench Hard                       Agent, Coding             31.8
Artificial Analysis Intelligence Index    Knowledge                 28.4
HLE                                       Knowledge, Multi-Modal    8.4
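
The reported Avg Score of 43 can be checked against the twelve listed scores. The sketch below assumes the figure is a plain unweighted mean of those scores, rounded to the nearest integer; the page does not state how it is computed, and the names `scores` and `average_score` are illustrative only.

```python
# Scores copied from the benchmark listing on this page.
scores = {
    "MMLU-Pro": 83.6,
    "GPQA Diamond": 75.1,
    "AIME 2025": 53.7,
    "LiveCodeBench": 52.9,
    "LCR": 43.3,
    "IFBench": 41.2,
    "𝜏²-Bench Telecom": 37.1,
    "SciCode": 32.1,
    "Artificial Analysis Coding Index": 31.9,
    "Terminal-Bench Hard": 31.8,
    "Artificial Analysis Intelligence Index": 28.4,
    "HLE": 8.4,
}


def average_score(values):
    """Unweighted arithmetic mean (an assumption about how Avg Score is derived)."""
    values = list(values)
    return sum(values) / len(values)


avg = average_score(scores.values())
print(f"{len(scores)} benchmarks, mean score {avg:.1f}")
# prints: 12 benchmarks, mean score 43.3
```

Under that assumption the mean is about 43.3, which rounds to the displayed value of 43.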