Alibaba

Qwen3 4B (Reasoning)

Size: 4B parameters (per the model name)

By Alibaba • Released 2025-04-28

[Figure: Capability Radar chart]

Avg Score: 33 (across all benchmarks)

Participated: 11 benchmarks

Benchmark Performance

Benchmark                                 Category                 Score
MATH-500                                  Reasoning                93.3
MMLU-Pro                                  Knowledge                69.6
GPQA Diamond                              Knowledge                52.2
LiveCodeBench                             Coding                   46.5
IFBench                                   Agent                    32.5
AIME 2025                                 Reasoning                22.3
τ²-Bench Telecom                          Reasoning, Knowledge     19
Artificial Analysis Intelligence Index    Knowledge                14.2
HLE                                       Knowledge, Multi-Modal   5.1
SciCode                                   Reasoning, Knowledge     3.5
LCR                                       Long-Context Reasoning   0
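
The page does not say how the average score is computed. One reading that matches the listed numbers is an unweighted mean of the 11 benchmark scores, rounded to the nearest integer. The sketch below checks that reading; the aggregation rule and the variable names are assumptions, not something the page documents.

```python
# Scores copied from the benchmark list on this page.
scores = {
    "MATH-500": 93.3,
    "MMLU-Pro": 69.6,
    "GPQA Diamond": 52.2,
    "LiveCodeBench": 46.5,
    "IFBench": 32.5,
    "AIME 2025": 22.3,
    "tau2-Bench Telecom": 19.0,
    "Artificial Analysis Intelligence Index": 14.2,
    "HLE": 5.1,
    "SciCode": 3.5,
    "LCR": 0.0,
}

# Assumption: the average is a plain unweighted mean over all benchmarks.
average = sum(scores.values()) / len(scores)
rounded = round(average)

print(f"{len(scores)} benchmarks, mean {average:.2f}, rounded {rounded}")
```

Under this assumption, the zero score on LCR counts toward the mean like any other benchmark, so it lowers the average rather than being skipped.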