Claude Opus 4.5 (Reasoning)

Model size: unknown

By Anthropic • Released 2025-11-24

[Figure: Capability Radar]

Average score: 67 (across all benchmarks)
Benchmarks participated: 13

Benchmark Performance

| Benchmark | Category | Score |
|---|---|---|
| AIME 2025 | Reasoning | 91.3 |
| MMLU-Pro | Knowledge | 89.5 |
| τ²-Bench Telecom | Reasoning Knowledge | 89.5 |
| LiveCodeBench | Coding | 87.1 |
| GPQA Diamond | Knowledge | 86.6 |
| SWE-bench (Bash Only) | Coding Agent | 74.4 |
| LCR | Long-Context Reasoning | 74 |
| IFBench | Agent | 58 |
| Artificial Analysis Intelligence Index | Knowledge | 49.7 |
| SciCode | Reasoning Knowledge | 49.5 |
| Artificial Analysis Coding Index | Coding | 47.8 |
| Terminal-Bench Hard | Agent Coding | 47 |
| HLE | Knowledge Multi-Modal | 28.4 |
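
The reported average score of 67 can be checked against the thirteen listed scores. The sketch below assumes the average is a plain unweighted arithmetic mean rounded to the nearest integer; the page does not state how it aggregates, so treat that as an assumption. The names `scores` and `mean_score` are illustrative only.

```python
# Scores copied from the Benchmark Performance listing.
scores = {
    "AIME 2025": 91.3,
    "MMLU-Pro": 89.5,
    "τ²-Bench Telecom": 89.5,
    "LiveCodeBench": 87.1,
    "GPQA Diamond": 86.6,
    "SWE-bench (Bash Only)": 74.4,
    "LCR": 74,
    "IFBench": 58,
    "Artificial Analysis Intelligence Index": 49.7,
    "SciCode": 49.5,
    "Artificial Analysis Coding Index": 47.8,
    "Terminal-Bench Hard": 47,
    "HLE": 28.4,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the source)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())
print(f"{len(scores)} benchmarks, mean = {average:.2f}, rounded = {round(average)}")
```

Under that assumption the mean of the thirteen scores rounds to 67, which matches the reported figure. Note that the listing mixes individual benchmarks with composite indices, so the average is only a rough summary.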