Kimi

Kimi K2.5 (Reasoning)

Unknown Size

By Kimi • Released 2026-01-27

Capability Radar

Avg Score: 58 (across all benchmarks)
Participated: 9 benchmarks

Benchmark Performance

Benchmark                                 Category                  Score
𝜏²-Bench Telecom                          Reasoning Knowledge       95.9
GPQA Diamond                              Knowledge                 87.9
IFBench                                   Agent                     70.2
LCR                                       Long-Context Reasoning    65.3
SciCode                                   Reasoning Knowledge       49
Artificial Analysis Intelligence Index    Knowledge                 46.7
Artificial Analysis Coding Index          Coding                    39.5
Terminal-Bench Hard                       Agent Coding              34.8
HLE                                       Knowledge Multi-Modal     29.4
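
The Avg Score of 58 is consistent with a plain unweighted mean of the nine scores listed above, rounded to the nearest integer. The page does not state how the figure is computed, so the formula in this sketch is an assumption, and the variable names are illustrative only.

```python
from statistics import mean

# Scores copied from the Benchmark Performance listing above.
scores = {
    "𝜏²-Bench Telecom": 95.9,
    "GPQA Diamond": 87.9,
    "IFBench": 70.2,
    "LCR": 65.3,
    "SciCode": 49,
    "Artificial Analysis Intelligence Index": 46.7,
    "Artificial Analysis Coding Index": 39.5,
    "Terminal-Bench Hard": 34.8,
    "HLE": 29.4,
}

# Assumption: the page's Avg Score is an unweighted arithmetic mean,
# rounded to the nearest integer. The source does not confirm this.
average = mean(list(scores.values()))
rounded_average = round(average)

print(f"{len(scores)} benchmarks, mean = {average:.2f}, rounded = {rounded_average}")
```

Under that assumption, every benchmark counts equally regardless of category, so the two high scores (𝜏²-Bench Telecom and GPQA Diamond) offset the lower agentic-coding and HLE results.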