Alibaba

Qwen3 VL 4B Instruct

Unknown Size

By Alibaba • Released 2025-10-14

[Figure: Capability Radar chart]

Avg Score: 22 (across all benchmarks)
Benchmarks participated: 12

Benchmark Performance

Benchmark                               Category                Score
MMLU-Pro                                Knowledge               63.4
GPQA Diamond                            Knowledge               37.1
AIME 2025                               Reasoning               37
IFBench                                 Agent                   31.8
LiveCodeBench                           Coding                  29
𝜏²-Bench Telecom                        Reasoning Knowledge     23.4
SciCode                                 Reasoning Knowledge     13.7
LCR                                     Long-Context Reasoning  13
Artificial Analysis Intelligence Index  Knowledge               9.5
Artificial Analysis Coding Index        Coding                  4.5
HLE                                     Knowledge Multi-Modal   3.7
Terminal-Bench Hard                     Agent Coding            0
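
The page does not say how the Avg Score is derived. A minimal sketch, assuming it is the unweighted arithmetic mean of the 12 listed scores (an assumption, not something the source confirms), reproduces the reported figure. The variable names `scores` and `mean_score` are illustrative only.

```python
# Scores copied from the Benchmark Performance listing above.
# Assumption: Avg Score is a plain unweighted mean of these values.
scores = {
    "MMLU-Pro": 63.4,
    "GPQA Diamond": 37.1,
    "AIME 2025": 37,
    "IFBench": 31.8,
    "LiveCodeBench": 29,
    "𝜏²-Bench Telecom": 23.4,
    "SciCode": 13.7,
    "LCR": 13,
    "Artificial Analysis Intelligence Index": 9.5,
    "Artificial Analysis Coding Index": 4.5,
    "HLE": 3.7,
    "Terminal-Bench Hard": 0,
}

# Unweighted mean across every benchmark the model participated in.
mean_score = sum(scores.values()) / len(scores)

print(f"benchmarks: {len(scores)}")
print(f"mean score: {mean_score:.3f}")
print(f"rounded:    {round(mean_score)}")
```

Under that assumption the mean comes to roughly 22.175, which rounds to the listed Avg Score of 22. Note that two of the rows are themselves composite indices, so a differently weighted average could give a different number.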