DeepSeek R1 Distill Llama 70B

Unknown Size

By DeepSeek • Released 2025-01-20

Avg Score: 32 (across all benchmarks)

Benchmarks participated: 13

Benchmark Performance

Benchmark                                 Category                  Score
MATH-500                                  Reasoning                 93.5
MMLU-Pro                                  Knowledge                 79.5
AIME 2025                                 Reasoning                 53.7
GPQA Diamond                              Knowledge                 40.2
SciCode                                   Reasoning, Knowledge      31.2
IFBench                                   Agent                     27.6
LiveCodeBench                             Coding                    26.6
𝜏²-Bench Telecom                          Reasoning, Knowledge      21.9
Artificial Analysis Intelligence Index    Knowledge                 16
Artificial Analysis Coding Index          Coding                    11.4
LCR                                       Long-Context Reasoning    11
HLE                                       Knowledge, Multi-Modal    6.1
Terminal-Bench Hard                       Agent, Coding             1.5
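
The Avg Score of 32 reported above is consistent with a plain unweighted mean of the 13 listed scores. The page does not state how the figure is computed, so the sketch below is an assumption rather than the site's documented method. The `scores` dictionary and the `average_score` helper are hypothetical names introduced for illustration; the values are copied from the Benchmark Performance table.

```python
# Scores copied from the Benchmark Performance table.
scores = {
    "MATH-500": 93.5,
    "MMLU-Pro": 79.5,
    "AIME 2025": 53.7,
    "GPQA Diamond": 40.2,
    "SciCode": 31.2,
    "IFBench": 27.6,
    "LiveCodeBench": 26.6,
    "tau2-Bench Telecom": 21.9,
    "Artificial Analysis Intelligence Index": 16.0,
    "Artificial Analysis Coding Index": 11.4,
    "LCR": 11.0,
    "HLE": 6.1,
    "Terminal-Bench Hard": 1.5,
}


def average_score(table):
    """Unweighted mean of all scores (assumed, not confirmed by the source)."""
    return sum(table.values()) / len(table)


mean = average_score(scores)
print(len(scores))      # number of benchmarks: 13
print(round(mean, 1))   # 32.3, which rounds to the reported 32
```

Note that this mean mixes individual benchmark scores with the two composite Artificial Analysis indices, so it is best read as a rough summary rather than a calibrated measure.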