Kimi K2 0905

Size: Unknown

By Kimi • Released 2025-09-05

[Figure: Capability Radar chart]

Avg Score: 47 (across all benchmarks)
Benchmarks participated: 13

Benchmark Performance

Benchmark                               Category                Score
MMLU-Pro                                Knowledge               81.9
GPQA Diamond                            Knowledge               76.7
τ²-Bench Telecom                        Reasoning Knowledge     73.4
LiveCodeBench                           Coding                  61
AIME 2025                               Reasoning               57.3
LCR                                     Long-Context Reasoning  52.3
SWE-bench (Bash Only)                   Coding Agent            43.8
IFBench                                 Agent                   41.7
Artificial Analysis Intelligence Index  Knowledge               30.8
SciCode                                 Reasoning Knowledge     30.7
Artificial Analysis Coding Index        Coding                  25.9
Terminal-Bench Hard                     Agent Coding            23.5
HLE                                     Knowledge Multi-Modal   6.3
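
The page does not say how the Avg Score is aggregated. A minimal sketch, assuming it is an unweighted arithmetic mean of the 13 listed scores, reproduces the reported value of 47 after rounding. The `scores` dictionary and the `average_score` helper are illustrative names, not part of any published API.

```python
from statistics import mean

# Scores copied from the Benchmark Performance listing on this page.
scores = {
    "MMLU-Pro": 81.9,
    "GPQA Diamond": 76.7,
    "tau2-Bench Telecom": 73.4,
    "LiveCodeBench": 61,
    "AIME 2025": 57.3,
    "LCR": 52.3,
    "SWE-bench (Bash Only)": 43.8,
    "IFBench": 41.7,
    "Artificial Analysis Intelligence Index": 30.8,
    "SciCode": 30.7,
    "Artificial Analysis Coding Index": 25.9,
    "Terminal-Bench Hard": 23.5,
    "HLE": 6.3,
}


def average_score(values):
    """Unweighted arithmetic mean (an assumed aggregation method)."""
    return mean(list(values))


avg = average_score(scores.values())
print(f"{len(scores)} benchmarks, mean = {avg:.2f}, rounded = {round(avg)}")
# prints "13 benchmarks, mean = 46.56, rounded = 47"
```

Because the two Artificial Analysis indices are themselves composite scores, a simple mean over all 13 rows counts some underlying benchmarks more than once; the figure is best read as a rough summary rather than a calibrated metric.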