Alibaba

Qwen3 8B (Non-reasoning)

Unknown Size

By Alibaba • Released 2025-04-28

Avg Score: 25 (across all benchmarks)

Participated: 13 benchmarks

Benchmark Performance

Benchmark                                 Category                  Score
MATH-500                                  Reasoning                 82.8
MMLU-Pro                                  Knowledge                 64.3
GPQA Diamond                              Knowledge                 45.2
IFBench                                   Agent                     28.6
𝜏²-Bench Telecom                          Reasoning, Knowledge      24.9
AIME 2025                                 Reasoning                 24.3
LiveCodeBench                             Coding                    20.2
SciCode                                   Reasoning, Knowledge      16.8
Artificial Analysis Intelligence Index    Knowledge                 10.6
Artificial Analysis Coding Index          Coding                    7.1
HLE                                       Knowledge, Multi-Modal    2.8
Terminal-Bench Hard                       Agent, Coding             2.3
LCR                                       Long-Context Reasoning    0
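
The page does not say how the Avg Score is computed. As a rough check, the sketch below assumes it is a plain unweighted mean of the 13 scores listed above, rounded to the nearest integer. That assumption is ours, not something the page states, and the variable names are illustrative only.

```python
from statistics import mean

# Scores copied from the Benchmark Performance listing above.
scores = {
    "MATH-500": 82.8,
    "MMLU-Pro": 64.3,
    "GPQA Diamond": 45.2,
    "IFBench": 28.6,
    "𝜏²-Bench Telecom": 24.9,
    "AIME 2025": 24.3,
    "LiveCodeBench": 20.2,
    "SciCode": 16.8,
    "Artificial Analysis Intelligence Index": 10.6,
    "Artificial Analysis Coding Index": 7.1,
    "HLE": 2.8,
    "Terminal-Bench Hard": 2.3,
    "LCR": 0.0,
}

# Assumption: "Avg Score" is an unweighted mean over every listed benchmark.
avg = mean(list(scores.values()))

print(f"benchmarks: {len(scores)}")
print(f"unweighted mean: {avg:.2f}")
print(f"rounded: {round(avg)}")
```

Under that assumption the mean comes to about 25.4, which rounds to the displayed 25. Note that the two Artificial Analysis indices are themselves aggregates, so an unweighted mean that includes them counts some underlying results more than once.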