Llama 3.1 Instruct 8B

Size: 8B parameters

By Meta • Released 2024-07-23

Average score: 18 (across all benchmarks)

Benchmarks participated: 13

Benchmark Performance

Benchmark                               Category                Score
MATH-500                                Reasoning               51.9
MMLU-Pro                                Knowledge               47.6
IFBench                                 Agent                   28.6
GPQA Diamond                            Knowledge               25.9
𝜏²-Bench Telecom                        Reasoning, Knowledge    16.4
LCR                                     Long-Context Reasoning  15.7
SciCode                                 Reasoning, Knowledge    13.2
Artificial Analysis Intelligence Index  Knowledge               11.7
LiveCodeBench                           Coding                  11.6
HLE                                     Knowledge, Multi-Modal  5.1
Artificial Analysis Coding Index        Coding                  4.9
AIME 2025                               Reasoning               4.3
Terminal-Bench Hard                     Agent, Coding           0.8
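
The reported average score of 18 is consistent with an unweighted arithmetic mean of the 13 scores listed above. The page does not state how the average is computed, so the short sketch below is only a sanity check under that assumption; the `scores` dictionary is a hypothetical container for the listed values, not something the page exposes.

```python
# Scores copied from the benchmark listing above.
# Assumption: the site's "average score" is a plain unweighted mean,
# with the two composite indices counted like any other entry.
scores = {
    "MATH-500": 51.9,
    "MMLU-Pro": 47.6,
    "IFBench": 28.6,
    "GPQA Diamond": 25.9,
    "tau2-Bench Telecom": 16.4,
    "LCR": 15.7,
    "SciCode": 13.2,
    "Artificial Analysis Intelligence Index": 11.7,
    "LiveCodeBench": 11.6,
    "HLE": 5.1,
    "Artificial Analysis Coding Index": 4.9,
    "AIME 2025": 4.3,
    "Terminal-Bench Hard": 0.8,
}

# Unweighted mean over every listed benchmark.
average = sum(scores.values()) / len(scores)

print(f"{len(scores)} benchmarks, mean score {average:.1f}")
# prints "13 benchmarks, mean score 18.3"
```

Rounded to the nearest integer, the mean of 18.3 matches the displayed value of 18.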