OpenAI

o1

Model size: unknown

By OpenAI • Released 2024-12-05

[Figure: Capability Radar chart]

Avg Score: 52 (across all benchmarks)

Benchmarks participated: 12

Benchmark Performance

Benchmark                               Category                Score
MATH-500                                Reasoning               97
MMLU-Pro                                Knowledge               84.1
GPQA Diamond                            Knowledge               74.7
IFBench                                 Agent                   70.3
LiveCodeBench                           Coding                  67.9
𝜏²-Bench Telecom                        Reasoning, Knowledge    62.6
LCR                                     Long-Context Reasoning  59.3
SciCode                                 Reasoning, Knowledge    35.8
Artificial Analysis Intelligence Index  Knowledge               30.7
Artificial Analysis Coding Index        Coding                  20.5
Terminal-Bench Hard                     Agent, Coding           12.9
HLE                                     Knowledge, Multi-Modal  7.7
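
The Avg Score of 52 appears to be the plain average of the 12 listed scores. The sketch below checks that reading. It assumes an unweighted arithmetic mean rounded to the nearest integer, which the page does not state explicitly; the `scores` dictionary and variable names are illustrative, with values copied from the table above.

```python
from statistics import mean

# Scores copied from the Benchmark Performance table.
scores = {
    "MATH-500": 97.0,
    "MMLU-Pro": 84.1,
    "GPQA Diamond": 74.7,
    "IFBench": 70.3,
    "LiveCodeBench": 67.9,
    "tau2-Bench Telecom": 62.6,
    "LCR": 59.3,
    "SciCode": 35.8,
    "Artificial Analysis Intelligence Index": 30.7,
    "Artificial Analysis Coding Index": 20.5,
    "Terminal-Bench Hard": 12.9,
    "HLE": 7.7,
}

# Assumption: the headline figure is an unweighted mean over all benchmarks.
avg_score = mean(list(scores.values()))

print(f"benchmarks: {len(scores)}")
print(f"mean score: {avg_score:.2f} (rounded: {round(avg_score)})")
```

Under that assumption the mean comes to about 51.96, which rounds to the reported 52. Note that two of the rows are themselves composite indices, so the average mixes individual benchmarks with aggregates.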