Google

Gemini 2.0 Flash Thinking Experimental (Jan '25)

Unknown Size

By Google • Released 2025-01-21

Avg Score: 45 (across all benchmarks)

Participated: 8 benchmarks

Benchmark Performance

Benchmark                                 Category               Score
MATH-500                                  Reasoning              94.4
MMLU-Pro                                  Knowledge              79.8
GPQA Diamond                              Knowledge              70.1
SciCode                                   Reasoning, Knowledge   32.9
LiveCodeBench                             Coding                 32.1
Artificial Analysis Coding Index          Coding                 24.1
Artificial Analysis Intelligence Index    Knowledge              19.6
HLE                                       Knowledge, Multi-Modal 7.1
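
The reported average of 45 appears consistent with an unweighted mean of the eight listed scores. The sketch below checks that arithmetic. It assumes the page averages the raw scores without weighting or normalization, which the source does not state, and the variable names are illustrative only.

```python
# Scores copied from the benchmark table above.
# Assumption: the page's "Avg Score" is a plain unweighted mean of these values.
scores = {
    "MATH-500": 94.4,
    "MMLU-Pro": 79.8,
    "GPQA Diamond": 70.1,
    "SciCode": 32.9,
    "LiveCodeBench": 32.1,
    "Artificial Analysis Coding Index": 24.1,
    "Artificial Analysis Intelligence Index": 19.6,
    "HLE": 7.1,
}

# Unweighted mean across all benchmarks the model participated in.
mean_score = sum(scores.values()) / len(scores)

print(f"benchmarks: {len(scores)}")
print(f"mean score: {mean_score:.2f}")
print(f"rounded:    {round(mean_score)}")
```

Because the benchmarks use different scales and difficulty levels, a plain mean like this is only a rough summary; the near-saturated MATH-500 score and the low HLE score pull it in opposite directions.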