Qwen3 VL 8B (Reasoning)

Unknown Size

By Alibaba • Released 2025-10-14

Avg Score: 29 (across all benchmarks)
Benchmarks participated: 12

Benchmark Performance

Benchmark                               Category                Score
MMLU-Pro                                Knowledge               74.9
GPQA Diamond                            Knowledge               57.9
IFBench                                 Agent                   39.9
LiveCodeBench                           Coding                  35.3
LCR                                     Long-Context Reasoning  31
AIME 2025                               Reasoning               30.7
𝜏²-Bench Telecom                        Reasoning, Knowledge    22.5
SciCode                                 Reasoning, Knowledge    21.9
Artificial Analysis Intelligence Index  Knowledge               16.6
Artificial Analysis Coding Index        Coding                  9.8
Terminal-Bench Hard                     Agent, Coding           3.8
HLE                                     Knowledge, Multi-Modal  3.3
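
The Avg Score of 29 appears consistent with an unweighted mean of the 12 listed scores. The page does not state how the average is computed, so the sketch below assumes a simple arithmetic mean; the variable names are illustrative only.

```python
# Scores copied from the Benchmark Performance listing on this page.
scores = {
    "MMLU-Pro": 74.9,
    "GPQA Diamond": 57.9,
    "IFBench": 39.9,
    "LiveCodeBench": 35.3,
    "LCR": 31.0,
    "AIME 2025": 30.7,
    "tau2-Bench Telecom": 22.5,
    "SciCode": 21.9,
    "Artificial Analysis Intelligence Index": 16.6,
    "Artificial Analysis Coding Index": 9.8,
    "Terminal-Bench Hard": 3.8,
    "HLE": 3.3,
}

# Assumption: the page's "Avg Score" is an unweighted arithmetic mean.
average = sum(scores.values()) / len(scores)

print(f"{len(scores)} benchmarks, mean score {average:.1f}")
```

Under that assumption the mean comes out just under 29, which matches the rounded figure shown on the page. Note that two of the entries are composite indices, so a mean over all 12 mixes individual benchmarks with aggregates.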