Alibaba

Qwen3 235B A22B 2507 (Reasoning)

Unknown Size

By Alibaba • Released 2025-07-25

Capability Radar

Avg Score: 56 (across all benchmarks)
Benchmarks participated: 13

Benchmark Performance

| Benchmark | Category | Score |
| --- | --- | --- |
| MATH-500 | Reasoning | 98.4 |
| AIME 2025 | Reasoning | 91 |
| MMLU-Pro | Knowledge | 84.3 |
| GPQA Diamond | Knowledge | 79 |
| LiveCodeBench | Coding | 78.8 |
| LCR | Long-Context Reasoning | 67 |
| 𝜏²-Bench Telecom | Reasoning, Knowledge | 53.2 |
| IFBench | Agent | 51.2 |
| SciCode | Reasoning, Knowledge | 42.4 |
| Artificial Analysis Intelligence Index | Knowledge | 29.5 |
| Artificial Analysis Coding Index | Coding | 23.2 |
| HLE | Knowledge, Multi-Modal | 15 |
| Terminal-Bench Hard | Agent, Coding | 13.6 |
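
The reported average score can be checked against the benchmark scores listed above. The following is a minimal sketch, assuming the average is an unweighted arithmetic mean over all 13 listed benchmarks; the page does not state how the figure is computed, so that is an assumption, and the `scores` dictionary is just a hand-copied illustration of the listed values.

```python
from statistics import mean

# Scores copied by hand from the Benchmark Performance listing.
scores = {
    "MATH-500": 98.4,
    "AIME 2025": 91,
    "MMLU-Pro": 84.3,
    "GPQA Diamond": 79,
    "LiveCodeBench": 78.8,
    "LCR": 67,
    "𝜏²-Bench Telecom": 53.2,
    "IFBench": 51.2,
    "SciCode": 42.4,
    "Artificial Analysis Intelligence Index": 29.5,
    "Artificial Analysis Coding Index": 23.2,
    "HLE": 15,
    "Terminal-Bench Hard": 13.6,
}

# Assumption: the page's "Avg Score" is a plain, unweighted mean.
avg = mean(scores.values())

print(f"benchmarks: {len(scores)}")
print(f"mean score: {avg:.1f} (rounded: {round(avg)})")
```

Under that assumption the mean comes to roughly 55.9, which rounds to the reported 56, and the count matches the 13 benchmarks the model participated in.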