Gemini 2.5 Flash Preview (Sep '25) (Reasoning)

Model size: unknown

By Google • Released 2025-09-25

Avg Score: 50 (across all benchmarks)

Benchmarks participated: 12

Benchmark Performance

Benchmark	Category	Score
MMLU-Pro	Knowledge	84.2
GPQA Diamond	Knowledge	79.3
AIME 2025	Reasoning	78.3
LiveCodeBench	Coding	71.3
LCR	Long-Context Reasoning	64.3
IFBench	Agent	52.3
𝜏²-Bench Telecom	Reasoning, Knowledge	45.6
SciCode	Reasoning, Knowledge	40.5
Artificial Analysis Intelligence Index	Knowledge	31.1
Artificial Analysis Coding Index	Coding	24.6
Terminal-Bench Hard	Agent, Coding	16.7
HLE	Knowledge, Multi-Modal	12.7
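
The Avg Score of 50 appears consistent with an unweighted mean of the 12 scores in the table. The sketch below checks that arithmetic. It assumes every benchmark is weighted equally, which the page does not state, and the variable names are illustrative only.

```python
# Scores copied from the Benchmark Performance table above.
scores = {
    "MMLU-Pro": 84.2,
    "GPQA Diamond": 79.3,
    "AIME 2025": 78.3,
    "LiveCodeBench": 71.3,
    "LCR": 64.3,
    "IFBench": 52.3,
    "tau2-Bench Telecom": 45.6,
    "SciCode": 40.5,
    "Artificial Analysis Intelligence Index": 31.1,
    "Artificial Analysis Coding Index": 24.6,
    "Terminal-Bench Hard": 16.7,
    "HLE": 12.7,
}

# Assumption: the page's "Avg Score" is a plain, equally weighted mean.
benchmark_count = len(scores)
mean_score = sum(scores.values()) / benchmark_count

print(f"benchmarks: {benchmark_count}")
print(f"mean score: {mean_score:.3f} (rounded: {round(mean_score)})")
```

Under that assumption the mean rounds to the reported value, so the headline figure and the table agree.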