Anthropic

Claude 3.5 Haiku

Unknown Size

By Anthropic • Released 2024-10-22

[Figure: Capability Radar]

Avg Score: 30 (across all benchmarks)
Benchmarks participated: 12

Benchmark Performance

Benchmark                                 Category                  Score
MATH-500                                  Reasoning                  72.1
MMLU-Pro                                  Knowledge                  63.4
IFBench                                   Agent                      42.8
GPQA Diamond                              Knowledge                  40.8
LiveCodeBench                             Coding                     31.4
SciCode                                   Reasoning, Knowledge       27.4
𝜏²-Bench Telecom                          Reasoning, Knowledge       24.6
LCR                                       Long-Context Reasoning     23.3
Artificial Analysis Intelligence Index    Knowledge                  18.7
Artificial Analysis Coding Index          Coding                     10.7
HLE                                       Knowledge, Multi-Modal      3.5
Terminal-Bench Hard                       Agent, Coding               2.3
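
The page does not state how Avg Score is derived. As a rough check, and assuming it is a simple unweighted mean of the 12 listed scores (an assumption, not something the page confirms), the figure can be reproduced with a short Python sketch. The `scores` dictionary is just the table above retyped by hand.

```python
from statistics import mean

# Scores copied by hand from the Benchmark Performance table.
scores = {
    "MATH-500": 72.1,
    "MMLU-Pro": 63.4,
    "IFBench": 42.8,
    "GPQA Diamond": 40.8,
    "LiveCodeBench": 31.4,
    "SciCode": 27.4,
    "𝜏²-Bench Telecom": 24.6,
    "LCR": 23.3,
    "Artificial Analysis Intelligence Index": 18.7,
    "Artificial Analysis Coding Index": 10.7,
    "HLE": 3.5,
    "Terminal-Bench Hard": 2.3,
}

# Assumption: "Avg Score" is an unweighted arithmetic mean over all rows.
avg = mean(scores.values())
print(f"{len(scores)} benchmarks, mean score {avg:.1f}")
```

Under that assumption the mean comes out at roughly 30.1, which matches the reported Avg Score of 30 after rounding to a whole number.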