Qwen3 VL 8B Instruct

Unknown Size

By Alibaba • Released 2025-10-14

[Figure: Capability Radar]

Avg Score: 24 (across all benchmarks)

Participated: 12 benchmarks

Benchmark Performance

Benchmark                               Category                Score
MMLU-Pro                                Knowledge               68.6
GPQA Diamond                            Knowledge               42.7
LiveCodeBench                           Coding                  33.2
IFBench                                 Agent                   32.3
𝜏²-Bench Telecom                        Reasoning Knowledge     29.2
AIME 2025                               Reasoning               27.3
SciCode                                 Reasoning Knowledge     17.4
LCR                                     Long-Context Reasoning  15.3
Artificial Analysis Intelligence Index  Knowledge               14.3
Artificial Analysis Coding Index        Coding                  7.3
HLE                                     Knowledge Multi-Modal   2.9
Terminal-Bench Hard                     Agent Coding            2.3
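
The page does not say how the Avg Score is derived. A minimal sketch, assuming it is the unweighted arithmetic mean of the twelve scores listed above rounded to the nearest integer, reproduces the reported figure. The variable names are illustrative only, and the two Artificial Analysis composite indices are included in the mean because the page counts them among the 12 benchmarks.

```python
# Scores copied from the Benchmark Performance table above.
scores = {
    "MMLU-Pro": 68.6,
    "GPQA Diamond": 42.7,
    "LiveCodeBench": 33.2,
    "IFBench": 32.3,
    "tau2-Bench Telecom": 29.2,
    "AIME 2025": 27.3,
    "SciCode": 17.4,
    "LCR": 15.3,
    "Artificial Analysis Intelligence Index": 14.3,
    "Artificial Analysis Coding Index": 7.3,
    "HLE": 2.9,
    "Terminal-Bench Hard": 2.3,
}

# Assumption: the page's "Avg Score" is a plain, unweighted mean.
average = sum(scores.values()) / len(scores)

print(f"Benchmarks: {len(scores)}")        # 12
print(f"Mean score: {average:.1f}")        # 24.4
print(f"Rounded:    {round(average)}")     # 24
```

Under this assumption the mean is 24.4, which matches the displayed Avg Score of 24 after rounding.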