Kimi

Kimi K2

Unknown Size

By Kimi • Released 2025-07-11

Avg Score: 49 (across all benchmarks)
Benchmarks participated: 14

Benchmark Performance

Benchmark                                Category                 Score
MATH-500                                 Reasoning                97.1
MMLU-Pro                                 Knowledge                82.4
GPQA Diamond                             Knowledge                76.6
τ-bench                                  Agent, Knowledge         64.3
τ²-Bench Telecom                         Reasoning, Knowledge     61.1
AIME 2025                                Reasoning                57
LiveCodeBench                            Coding                   55.6
LCR                                      Long-Context Reasoning   51
IFBench                                  Agent                    41.5
SciCode                                  Reasoning, Knowledge     34.5
Artificial Analysis Intelligence Index   Knowledge                26.2
Artificial Analysis Coding Index         Coding                   22.1
Terminal-Bench Hard                      Agent, Coding            15.9
HLE                                      Knowledge, Multi-Modal   7
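
The page does not say how the Avg Score is computed. As a rough check, the sketch below assumes it is a plain unweighted mean of the 14 scores listed above; the `scores` dictionary and variable names are illustrative, not part of the original page.

```python
# Scores copied from the Benchmark Performance listing above.
# Assumption: the page's "Avg Score" is an unweighted arithmetic mean.
scores = {
    "MATH-500": 97.1,
    "MMLU-Pro": 82.4,
    "GPQA Diamond": 76.6,
    "τ-bench": 64.3,
    "τ²-Bench Telecom": 61.1,
    "AIME 2025": 57.0,
    "LiveCodeBench": 55.6,
    "LCR": 51.0,
    "IFBench": 41.5,
    "SciCode": 34.5,
    "Artificial Analysis Intelligence Index": 26.2,
    "Artificial Analysis Coding Index": 22.1,
    "Terminal-Bench Hard": 15.9,
    "HLE": 7.0,
}

benchmark_count = len(scores)
mean_score = sum(scores.values()) / benchmark_count

print(f"benchmarks: {benchmark_count}")
print(f"unweighted mean: {mean_score:.2f}")
```

Under this assumption the mean comes out at about 49.45, which agrees with the displayed Avg Score of 49 once the decimals are dropped.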