Qwen3 8B (Reasoning)

Unknown Size

By Alibaba • Released 2025-04-28

Capability Radar

Avg Score: 30 (across all benchmarks)
Participated: 13 benchmarks

Benchmark Performance

Benchmark                               Category                Score
MATH-500                                Reasoning               90.4
MMLU-Pro                                Knowledge               74.3
GPQA Diamond                            Knowledge               58.9
LiveCodeBench                           Coding                  40.6
IFBench                                 Agent                   33.5
𝜏²-Bench Telecom                        Reasoning, Knowledge    27.8
SciCode                                 Reasoning, Knowledge    22.6
AIME 2025                               Reasoning               19
Artificial Analysis Intelligence Index  Knowledge               13.1
Artificial Analysis Coding Index        Coding                  9
HLE                                     Knowledge, Multi-Modal  4.2
Terminal-Bench Hard                     Agent, Coding           2.3
LCR                                     Long-Context Reasoning  0
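
The page does not say how the Avg Score of 30 is derived. As a rough check, the sketch below assumes it is an unweighted mean of the 13 listed scores, rounded to the nearest integer. That assumption is ours, not something the source confirms, and the `scores` dictionary is just the benchmark values copied from this page.

```python
from statistics import mean

# Scores copied from the benchmark list on this page.
scores = {
    "MATH-500": 90.4,
    "MMLU-Pro": 74.3,
    "GPQA Diamond": 58.9,
    "LiveCodeBench": 40.6,
    "IFBench": 33.5,
    "tau2-Bench Telecom": 27.8,
    "SciCode": 22.6,
    "AIME 2025": 19,
    "Artificial Analysis Intelligence Index": 13.1,
    "Artificial Analysis Coding Index": 9,
    "HLE": 4.2,
    "Terminal-Bench Hard": 2.3,
    "LCR": 0,
}

# Assumption: the displayed average is a plain (unweighted) mean.
avg = mean(list(scores.values()))

print(f"{len(scores)} benchmarks, mean = {avg:.1f}, rounded = {round(avg)}")
```

Under that assumption the result rounds to 30 and the count is 13, which matches the figures shown here. Note that the list mixes individual benchmarks with composite indexes, so a plain mean over all rows is only one plausible reading of how the figure was produced.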