Alibaba

Qwen3 VL 235B A22B (Reasoning)

Unknown Size

By Alibaba • Released 2025-09-23

Avg Score: 49 (across all benchmarks)
Benchmarks participated: 12

Benchmark Performance

Benchmark                                 Category                  Score
AIME 2025                                 Reasoning                 88.3
MMLU-Pro                                  Knowledge                 83.6
GPQA Diamond                              Knowledge                 77.2
LiveCodeBench                             Coding                    64.6
LCR                                       Long-Context Reasoning    58.7
IFBench                                   Agent                     56.5
𝜏²-Bench Telecom                          Reasoning, Knowledge      54.1
SciCode                                   Reasoning, Knowledge      39.9
Artificial Analysis Intelligence Index    Knowledge                 27.5
Artificial Analysis Coding Index          Coding                    20.9
Terminal-Bench Hard                       Agent, Coding             11.4
HLE                                       Knowledge, Multi-Modal    10.1
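
The page does not say how the average score is derived. As a rough check, the sketch below assumes it is an unweighted mean over all 12 listed rows, including the two composite Artificial Analysis indices. That assumption is ours, not the page's, but it does reproduce the reported figure of 49 after rounding.

```python
from statistics import mean

# Scores copied from the Benchmark Performance table.
scores = {
    "AIME 2025": 88.3,
    "MMLU-Pro": 83.6,
    "GPQA Diamond": 77.2,
    "LiveCodeBench": 64.6,
    "LCR": 58.7,
    "IFBench": 56.5,
    "𝜏²-Bench Telecom": 54.1,
    "SciCode": 39.9,
    "Artificial Analysis Intelligence Index": 27.5,
    "Artificial Analysis Coding Index": 20.9,
    "Terminal-Bench Hard": 11.4,
    "HLE": 10.1,
}

# Assumption: the headline number is a plain unweighted mean of every row.
avg = mean(list(scores.values()))
print(f"{len(scores)} benchmarks, mean score {avg:.1f}")  # 12 benchmarks, mean score 49.4
```

Because the benchmarks use different scales and difficulty levels, a plain mean like this is best read as a coarse summary rather than a calibrated measure.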