Qwen3 14B (Reasoning)

Unknown Size

By Alibaba • Released 2025-04-28

Capability Radar

Avg Score: 37 (across all benchmarks)

Benchmarks participated: 13

Benchmark Performance

Benchmark                               Category                Score
MATH-500                                Reasoning               96.1
MMLU-Pro                                Knowledge               77.4
GPQA Diamond                            Knowledge               60.4
AIME 2025                               Reasoning               55.7
LiveCodeBench                           Coding                  52.3
IFBench                                 Agent                   40.5
𝜏²-Bench Telecom                        Reasoning Knowledge     34.5
SciCode                                 Reasoning Knowledge     31.6
Artificial Analysis Intelligence Index  Knowledge               16.2
Artificial Analysis Coding Index        Coding                  13.1
HLE                                     Knowledge Multi-Modal   4.3
Terminal-Bench Hard                     Agent Coding            3.8
LCR                                     Long-Context Reasoning  0
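
The page does not say how the Avg Score of 37 is derived. A minimal sketch, assuming it is an unweighted mean of the 13 listed scores (the dictionary name `scores` and the rounding step are illustrative assumptions, not something the page states), reproduces that figure:

```python
# Scores copied from the Benchmark Performance listing above.
# Assumption: Avg Score is a plain unweighted mean over all listed benchmarks.
scores = {
    "MATH-500": 96.1,
    "MMLU-Pro": 77.4,
    "GPQA Diamond": 60.4,
    "AIME 2025": 55.7,
    "LiveCodeBench": 52.3,
    "IFBench": 40.5,
    "𝜏²-Bench Telecom": 34.5,
    "SciCode": 31.6,
    "Artificial Analysis Intelligence Index": 16.2,
    "Artificial Analysis Coding Index": 13.1,
    "HLE": 4.3,
    "Terminal-Bench Hard": 3.8,
    "LCR": 0.0,
}

# Unweighted mean across every benchmark the model participated in.
mean_score = sum(scores.values()) / len(scores)

print(f"benchmarks: {len(scores)}")
print(f"mean score: {mean_score:.2f} (rounded: {round(mean_score)})")
```

Under that assumption the mean is roughly 37.4, which rounds to the reported 37. Note that the two Artificial Analysis indices are themselves composite scores, so a simple mean counts some underlying results more than once.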