Anthropic

Claude 2.1

Unknown Size

By Anthropic • Released 2023-11-21

Capability Radar

Avg Score: 23 (across all benchmarks)

Benchmarks participated: 8

Benchmark Performance

Benchmark                                 Category                 Score
MMLU-Pro                                  Knowledge                49.5
MATH-500                                  Reasoning                37.4
GPQA Diamond                              Knowledge                31.9
LiveCodeBench                             Coding                   19.5
SciCode                                   Reasoning, Knowledge     18.4
Artificial Analysis Coding Index          Coding                   14
Artificial Analysis Intelligence Index    Knowledge                9.3
HLE                                       Knowledge, Multi-Modal   4.2
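
The Avg Score of 23 appears consistent with a plain unweighted mean of the eight scores listed above. The page does not state how the average is computed, so the aggregation below is an assumption, and the names `scores` and `average_score` are illustrative only. A minimal sketch of that check:

```python
# Scores copied from the benchmark listing above.
scores = {
    "MMLU-Pro": 49.5,
    "MATH-500": 37.4,
    "GPQA Diamond": 31.9,
    "LiveCodeBench": 19.5,
    "SciCode": 18.4,
    "Artificial Analysis Coding Index": 14.0,
    "Artificial Analysis Intelligence Index": 9.3,
    "HLE": 4.2,
}


def average_score(values):
    """Unweighted arithmetic mean.

    Assumption: the page's "Avg Score" is a simple mean with no
    per-benchmark weighting; the source does not confirm this.
    """
    values = list(values)
    return sum(values) / len(values)


avg = average_score(scores.values())
print(len(scores), round(avg))  # prints "8 23"
```

Under this assumption the mean is roughly 23.03, which rounds to the displayed 23. If the site weights or normalizes benchmarks differently, the agreement here would be coincidental.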