Qwen2.5 Instruct 72B

Unknown Size

By Alibaba • Released 2024-09-19

Capability Radar

Avg Score: 31 (across all benchmarks)
Participated: 13 benchmarks

Benchmark Performance

Benchmark | Category | Score
MATH-500 | Reasoning | 85.8
MMLU-Pro | Knowledge | 72
GPQA Diamond | Knowledge | 49.1
IFBench | Agent | 36.9
𝜏²-Bench Telecom | Reasoning, Knowledge | 34.5
LiveCodeBench | Coding | 27.6
SciCode | Reasoning, Knowledge | 26.7
LCR | Long-Context Reasoning | 20.3
Artificial Analysis Intelligence Index | Knowledge | 15.6
AIME 2025 | Reasoning | 14
Artificial Analysis Coding Index | Coding | 11.9
Terminal-Bench Hard | Agent, Coding | 4.5
HLE | Knowledge, Multi-Modal | 4.2
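
The Avg Score of 31 is consistent with an unweighted mean of the 13 scores listed. The page does not state how the average is aggregated, so the following is only a minimal sketch that assumes a plain arithmetic mean over the scores as shown; the variable names are illustrative.

```python
from statistics import mean

# Scores copied from the Benchmark Performance listing on this page.
scores = {
    "MATH-500": 85.8,
    "MMLU-Pro": 72,
    "GPQA Diamond": 49.1,
    "IFBench": 36.9,
    "𝜏²-Bench Telecom": 34.5,
    "LiveCodeBench": 27.6,
    "SciCode": 26.7,
    "LCR": 20.3,
    "Artificial Analysis Intelligence Index": 15.6,
    "AIME 2025": 14,
    "Artificial Analysis Coding Index": 11.9,
    "Terminal-Bench Hard": 4.5,
    "HLE": 4.2,
}

# Assumption: the site's "Avg Score" is an unweighted mean of all listed scores.
avg = mean(scores.values())

print(f"benchmarks: {len(scores)}")
print(f"unweighted mean: {avg:.1f}")
```

Under that assumption the mean rounds to 31, matching the figure reported above. Because two of the entries are themselves composite indices, a site that weights or excludes them could arrive at a different number.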