Anthropic

Claude Opus 4.5 (Non-reasoning)

Unknown Size

By Anthropic • Released 2025-11-24

Capability Radar

Avg Score: 57 (across all benchmarks)

Participated: 12 benchmarks

Benchmark Performance

Benchmark                                 Category                 Score
MMLU-Pro                                  Knowledge                88.9
𝜏²-Bench Telecom                          Reasoning Knowledge      86.3
GPQA Diamond                              Knowledge                81
LiveCodeBench                             Coding                   73.8
LCR                                       Long-Context Reasoning   65.3
AIME 2025                                 Reasoning                62.7
SciCode                                   Reasoning Knowledge      47
Artificial Analysis Intelligence Index    Knowledge                43
IFBench                                   Agent                    43
Artificial Analysis Coding Index          Coding                   42.9
Terminal-Bench Hard                       Agent Coding             40.9
HLE                                       Knowledge Multi-Modal    12.9
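
The page does not say how the Avg Score is derived. As a rough check, the sketch below assumes it is the unweighted arithmetic mean of the twelve listed scores, rounded to the nearest integer. That aggregation rule is an assumption, not something the source confirms, and the names `scores` and `mean_score` are illustrative only.

```python
# Scores copied from the Benchmark Performance listing above.
scores = {
    "MMLU-Pro": 88.9,
    "𝜏²-Bench Telecom": 86.3,
    "GPQA Diamond": 81,
    "LiveCodeBench": 73.8,
    "LCR": 65.3,
    "AIME 2025": 62.7,
    "SciCode": 47,
    "Artificial Analysis Intelligence Index": 43,
    "IFBench": 43,
    "Artificial Analysis Coding Index": 42.9,
    "Terminal-Bench Hard": 40.9,
    "HLE": 12.9,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed aggregation; not confirmed by the source)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())
print(f"benchmarks: {len(scores)}")        # benchmarks: 12
print(f"rounded mean: {round(average)}")   # rounded mean: 57
```

Under that assumption the mean comes to roughly 57.3, which is consistent with the displayed Avg Score of 57 and the Participated count of 12. Two of the rows are themselves composite indices, so a weighted or deduplicated average could give a different figure.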