StepFun

Step3 VL 10B

Unknown Size

By StepFun • Released 2026-01-20

Capability Radar

Avg Score: 23 (across all benchmarks)
Benchmarks participated: 9

Benchmark Performance

Benchmark                                 Category                  Score
GPQA Diamond                              Knowledge                 69
IFBench                                   Agent                     50.2
SciCode                                   Reasoning, Knowledge      31.1
𝜏²-Bench Telecom                          Reasoning, Knowledge      16.1
Artificial Analysis Intelligence Index    Knowledge                 15.4
Artificial Analysis Coding Index          Coding                    13.9
HLE                                       Knowledge, Multi-Modal    10.2
Terminal-Bench Hard                       Agent, Coding             5.3
LCR                                       Long-Context Reasoning    0
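
The reported average score of 23 appears consistent with an unweighted mean of the nine benchmark scores listed above. The page does not state how the average is computed, so the sketch below is only an assumption: it takes a plain arithmetic mean and rounds to the nearest integer. The variable names are illustrative and not part of the source.

```python
from statistics import mean

# Scores copied from the benchmark listing on this page.
scores = {
    "GPQA Diamond": 69,
    "IFBench": 50.2,
    "SciCode": 31.1,
    "𝜏²-Bench Telecom": 16.1,
    "Artificial Analysis Intelligence Index": 15.4,
    "Artificial Analysis Coding Index": 13.9,
    "HLE": 10.2,
    "Terminal-Bench Hard": 5.3,
    "LCR": 0,
}

# Assumption: the page's "Avg Score" is an unweighted mean over all entries.
avg_score = mean(scores.values())

print(f"benchmarks: {len(scores)}")      # 9
print(f"mean score: {avg_score:.2f}")    # 23.47
print(f"rounded:    {round(avg_score)}") # 23
```

Under this assumption, the mean of about 23.47 rounds to the displayed value of 23, and the entry count matches the nine benchmarks reported.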