OpenAI

gpt-oss-120B (high)

Unknown Size

By OpenAI • Released 2025-08-05

Avg Score: 53 (across all benchmarks)

Benchmarks participated: 13

Benchmark Performance

Benchmark                                 Category                  Score
AIME 2025                                 Reasoning                 93.4
LiveCodeBench                             Coding                    87.8
MMLU-Pro                                  Knowledge                 80.8
GPQA Diamond                              Knowledge                 78.2
IFBench                                   Agent                     69
𝜏²-Bench Telecom                          Reasoning Knowledge       65.8
LCR                                       Long-Context Reasoning    50.7
SciCode                                   Reasoning Knowledge       38.9
Artificial Analysis Intelligence Index    Knowledge                 33.3
Artificial Analysis Coding Index          Coding                    28.6
SWE-bench (Bash Only)                     Coding Agent              26
Terminal-Bench Hard                       Agent Coding              23.5
HLE                                       Knowledge Multi-Modal     18.5
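
The Avg Score can be cross-checked against the per-benchmark scores. The page does not say how the average is computed, so the sketch below assumes an unweighted arithmetic mean over the 13 listed scores; the `scores` dictionary and the `average_score` helper are illustrative names, not part of the source.

```python
# Scores copied from the Benchmark Performance listing.
scores = {
    "AIME 2025": 93.4,
    "LiveCodeBench": 87.8,
    "MMLU-Pro": 80.8,
    "GPQA Diamond": 78.2,
    "IFBench": 69.0,
    "𝜏²-Bench Telecom": 65.8,
    "LCR": 50.7,
    "SciCode": 38.9,
    "Artificial Analysis Intelligence Index": 33.3,
    "Artificial Analysis Coding Index": 28.6,
    "SWE-bench (Bash Only)": 26.0,
    "Terminal-Bench Hard": 23.5,
    "HLE": 18.5,
}


def average_score(values):
    """Unweighted arithmetic mean (an assumption; the page does not state the method)."""
    values = list(values)
    return sum(values) / len(values)


avg = average_score(scores.values())
print(f"benchmarks: {len(scores)}, average: {avg:.1f}")
```

Under that assumption the mean comes to about 53.4, which is consistent with the rounded Avg Score of 53 and the count of 13 benchmarks reported above.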