Allen Institute for AI

Olmo 3 7B Think

Unknown Size

By Allen Institute for AI • Released 2025-11-20

[Figure: Capability Radar]

Avg Score: 28 (across all benchmarks)
Benchmarks participated: 12

Benchmark Performance

Benchmark                              | Category               | Score
---------------------------------------|------------------------|------
AIME 2025                              | Reasoning              | 70.7
MMLU-Pro                               | Knowledge              | 65.5
LiveCodeBench                          | Coding                 | 61.7
GPQA Diamond                           | Knowledge              | 51.6
IFBench                                | Agent                  | 41.5
SciCode                                | Reasoning, Knowledge   | 21.2
Artificial Analysis Intelligence Index | Knowledge              | 9.5
Artificial Analysis Coding Index       | Coding                 | 7.6
HLE                                    | Knowledge, Multi-Modal | 5.7
Terminal-Bench Hard                    | Agent, Coding          | 0.8
LCR                                    | Long-Context Reasoning | 0
𝜏²-Bench Telecom                       | Reasoning, Knowledge   | 0
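
The page does not say how the average score is derived. As a rough check, the sketch below assumes it is the unweighted arithmetic mean of the 12 listed benchmark scores, rounded to the nearest integer; that assumption is not confirmed by the source, and the variable and function names are illustrative only.

```python
# Scores copied from the Benchmark Performance listing for Olmo 3 7B Think.
scores = {
    "AIME 2025": 70.7,
    "MMLU-Pro": 65.5,
    "LiveCodeBench": 61.7,
    "GPQA Diamond": 51.6,
    "IFBench": 41.5,
    "SciCode": 21.2,
    "Artificial Analysis Intelligence Index": 9.5,
    "Artificial Analysis Coding Index": 7.6,
    "HLE": 5.7,
    "Terminal-Bench Hard": 0.8,
    "LCR": 0.0,
    "𝜏²-Bench Telecom": 0.0,
}


def average_score(benchmark_scores: dict[str, float]) -> float:
    """Unweighted mean over all benchmarks (assumed aggregation rule)."""
    return sum(benchmark_scores.values()) / len(benchmark_scores)


avg = average_score(scores)
print(f"{len(scores)} benchmarks, mean = {avg:.2f}, rounded = {round(avg)}")
```

Under this assumption the rounded mean agrees with the reported average of 28. Note that the two zero scores and the low index scores pull the mean well below the model's results on AIME 2025, MMLU-Pro, and LiveCodeBench.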