Allen Institute for AI

Llama 3.1 Tulu3 405B

Unknown Size

By Allen Institute for AI • Released 2025-01-30

[Figure: Capability Radar chart]

Avg Score: 40 (across all benchmarks)

Benchmarks participated: 7

Benchmark Performance

Benchmark                                 Category                 Score
MATH-500                                  Reasoning                77.8
MMLU-Pro                                  Knowledge                71.6
GPQA Diamond                              Knowledge                51.6
SciCode                                   Reasoning, Knowledge     30.2
LiveCodeBench                             Coding                   29.1
Artificial Analysis Intelligence Index    Knowledge                14.1
HLE                                       Knowledge, Multi-Modal   3.5
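
The page does not say how the Avg Score is derived. As a minimal sketch, assuming it is the unweighted mean of the seven benchmark scores listed above, rounded to the nearest integer, the arithmetic can be checked as follows. The names `SCORES` and `mean_score` are illustrative and not taken from the page.

```python
# Benchmark scores copied from the Benchmark Performance table.
SCORES = {
    "MATH-500": 77.8,
    "MMLU-Pro": 71.6,
    "GPQA Diamond": 51.6,
    "SciCode": 30.2,
    "LiveCodeBench": 29.1,
    "Artificial Analysis Intelligence Index": 14.1,
    "HLE": 3.5,
}


def mean_score(scores: dict[str, float]) -> float:
    """Return the unweighted arithmetic mean of the scores (an assumed formula)."""
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    avg = mean_score(SCORES)
    # Show the unrounded mean next to the rounded value for comparison.
    print(f"{len(SCORES)} benchmarks, mean = {avg:.1f}, rounded = {round(avg)}")
```

Under that assumption the mean is about 39.7, which rounds to the displayed 40. A weighted or per-category average would also be plausible, so this is a consistency check and not a confirmation of the page's method.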