ERNIE 5.0 Thinking Preview

Model size: unknown

By Baidu • Released 2025-11-13

[Figure: Capability Radar]

Average score: 49 (across all benchmarks)
Benchmarks participated: 12

Benchmark Performance

Benchmark                                 Category                 Score
AIME 2025                                 Reasoning                85
τ²-Bench Telecom                          Reasoning Knowledge      83.9
MMLU-Pro                                  Knowledge                83
LiveCodeBench                             Coding                   81.2
GPQA Diamond                              Knowledge                77.7
IFBench                                   Agent                    41.4
SciCode                                   Reasoning Knowledge      37.5
Artificial Analysis Coding Index          Coding                   29.2
Artificial Analysis Intelligence Index    Knowledge                29.1
Terminal-Bench Hard                       Agent Coding             25
HLE                                       Knowledge Multi-Modal    12.7
LCR                                       Long-Context Reasoning   6.7
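
The reported average of 49 can be checked against the twelve listed scores. The sketch below assumes the page uses a plain unweighted mean rounded to the nearest integer; the page does not state its method, so this is an assumption. The `scores` dictionary is simply the table transcribed by hand.

```python
from statistics import mean

# Scores transcribed from the Benchmark Performance table.
scores = {
    "AIME 2025": 85,
    "τ²-Bench Telecom": 83.9,
    "MMLU-Pro": 83,
    "LiveCodeBench": 81.2,
    "GPQA Diamond": 77.7,
    "IFBench": 41.4,
    "SciCode": 37.5,
    "Artificial Analysis Coding Index": 29.2,
    "Artificial Analysis Intelligence Index": 29.1,
    "Terminal-Bench Hard": 25,
    "HLE": 12.7,
    "LCR": 6.7,
}

# Assumption: the headline figure is an unweighted mean over all benchmarks.
average = mean(scores.values())
print(f"{len(scores)} benchmarks, mean score {average:.1f}")
```

Under that assumption the mean comes to roughly 49.4, which rounds to the reported 49. Note that the scores mix different scales (accuracy percentages and composite indices), so the average is a coarse summary at best.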