Alibaba

Qwen3 235B A22B 2507 Instruct

Size: Unknown

By Alibaba • Released 2025-07-21

[Figure: Capability Radar]

Avg Score: 46 (across all benchmarks)
Benchmarks participated: 13

Benchmark Performance

Benchmark                                Category                  Score
MATH-500                                 Reasoning                 98
MMLU-Pro                                 Knowledge                 82.8
GPQA Diamond                             Knowledge                 75.3
AIME 2025                                Reasoning                 71.7
LiveCodeBench                            Coding                    52.4
IFBench                                  Agent                     46.1
SciCode                                  Reasoning, Knowledge      36
𝜏²-Bench Telecom                         Reasoning, Knowledge      33.3
LCR                                      Long-Context Reasoning    31.2
Artificial Analysis Intelligence Index   Knowledge                 24.7
Artificial Analysis Coding Index         Coding                    22.1
Terminal-Bench Hard                      Agent, Coding             15.2
HLE                                      Knowledge, Multi-Modal    10.6
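
The Avg Score of 46 is consistent with an unweighted mean of the 13 listed scores. The page does not state how it aggregates, so treating it as a simple mean is an assumption. A minimal Python sketch of that check, with the scores copied from the listing above and a hypothetical helper name `average_score`:

```python
# Scores copied from the Benchmark Performance listing above.
scores = {
    "MATH-500": 98,
    "MMLU-Pro": 82.8,
    "GPQA Diamond": 75.3,
    "AIME 2025": 71.7,
    "LiveCodeBench": 52.4,
    "IFBench": 46.1,
    "SciCode": 36,
    "𝜏²-Bench Telecom": 33.3,
    "LCR": 31.2,
    "Artificial Analysis Intelligence Index": 24.7,
    "Artificial Analysis Coding Index": 22.1,
    "Terminal-Bench Hard": 15.2,
    "HLE": 10.6,
}


def average_score(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the page)."""
    values = list(values)
    return sum(values) / len(values)


avg = average_score(scores.values())
print(len(scores), round(avg))  # prints: 13 46
```

Because the two composite indices (Artificial Analysis Intelligence Index and Coding Index) are averaged alongside individual benchmarks, the headline figure mixes scales and is best read as a rough summary.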