Alibaba

Qwen3 VL 4B (Reasoning)

Unknown Size

By Alibaba • Released 2025-10-14

Capability Radar

Avg Score: 25 (across all benchmarks)

Participated: 12 benchmarks

Benchmark Performance

| Benchmark | Category | Score |
| MMLU-Pro | Knowledge | 70 |
| GPQA Diamond | Knowledge | 49.4 |
| IFBench | Agent | 36.6 |
| LiveCodeBench | Coding | 32 |
| AIME 2025 | Reasoning | 25.7 |
| LCR | Long-Context Reasoning | 21.3 |
| SciCode | Reasoning Knowledge | 17.1 |
| 𝜏²-Bench Telecom | Reasoning Knowledge | 15.5 |
| Artificial Analysis Intelligence Index | Knowledge | 14.9 |
| Artificial Analysis Coding Index | Coding | 6.7 |
| HLE | Knowledge Multi-Modal | 4.4 |
| Terminal-Bench Hard | Agent Coding | 1.5 |
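
The Avg Score of 25 is consistent with a plain unweighted mean of the 12 scores listed under Benchmark Performance, rounded to the nearest integer. The page does not state how the average is computed, so the sketch below is only a sanity check under that assumption. The `scores` dictionary is a hand-copied transcription of the listing, not an official data export, and it treats the two Artificial Analysis composite indices like any other row.

```python
from statistics import mean

# Scores transcribed by hand from the Benchmark Performance listing.
# Assumption: the page's "Avg Score" is an unweighted mean over all 12 rows.
scores = {
    "MMLU-Pro": 70,
    "GPQA Diamond": 49.4,
    "IFBench": 36.6,
    "LiveCodeBench": 32,
    "AIME 2025": 25.7,
    "LCR": 21.3,
    "SciCode": 17.1,
    "tau2-Bench Telecom": 15.5,
    "Artificial Analysis Intelligence Index": 14.9,
    "Artificial Analysis Coding Index": 6.7,
    "HLE": 4.4,
    "Terminal-Bench Hard": 1.5,
}

average = mean(scores.values())

# Prints: 12 benchmarks, mean score 24.59, rounded 25
print(f"{len(scores)} benchmarks, mean score {average:.2f}, rounded {round(average)}")
```

The unrounded mean is about 24.59, which matches the displayed 25 only after rounding. If the site weights benchmarks or excludes the composite indices, the result would differ.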