LiveCodeBench

Publisher: UC Berkeley, MIT, and Cornell University
Last Sync: 2026-01-28
Official Site: Link

Overview

LiveCodeBench, developed by researchers at UC Berkeley, MIT, and Cornell University, takes a dynamic approach to evaluating code-generation models. Unlike static benchmarks, it is continuously updated with new problems drawn from real-world programming competitions, which keeps evaluation ahead of model training cutoffs and reduces the risk of data contamination.

Data Sources

The benchmark draws from three major competitive programming platforms:

| Platform   | Description                                                | Problem Types                              |
|------------|------------------------------------------------------------|--------------------------------------------|
| LeetCode   | Popular platform for interview prep and algorithm practice | Data structures, algorithms, optimization  |
| AtCoder    | Japanese competitive programming platform                  | Algorithm challenges, contests             |
| Codeforces | Largest competitive programming community                  | Diverse algorithmic problems               |

Key Characteristics

| Feature                | Description                                                                          |
|------------------------|--------------------------------------------------------------------------------------|
| Dynamic Updates        | New problems are added as competitions occur                                         |
| Real-world Testing     | Problems come from actual contests, not synthetic examples                           |
| Comprehensive Coverage | Multiple difficulty levels and topics                                                |
| Holistic Evaluation    | Tests scenarios beyond generation, such as self-repair, code execution, and test output prediction |

What Makes LiveCodeBench Valuable

  1. Currency: Always reflects current programming challenges
  2. Difficulty Progression: Problems range from easy to extremely difficult
  3. Diverse Problem Types: Covers algorithms, data structures, optimization, and more
  4. Automated Evaluation: Test cases verify correctness automatically
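The automated evaluation mentioned above can be sketched as follows. This is a hypothetical, minimal harness (the function name `run_against_tests` and the stdin/stdout test format are illustrative assumptions, not LiveCodeBench's actual implementation): generated code passes a problem only if it produces the expected output for every hidden test case within a time limit.

```python
import subprocess
import sys
import tempfile

def run_against_tests(code: str, tests: list[tuple[str, str]],
                      timeout: float = 5.0) -> bool:
    """Run candidate code against (stdin, expected_stdout) pairs.

    Returns True only if every test case matches within the time limit.
    This mirrors how competitive-programming judges verify submissions.
    """
    # Write the candidate program to a temporary file so we can execute it.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name

    for stdin_data, expected in tests:
        try:
            result = subprocess.run(
                [sys.executable, path],
                input=stdin_data,
                capture_output=True,
                text=True,
                timeout=timeout,  # enforce a per-test time limit
            )
        except subprocess.TimeoutExpired:
            return False  # too slow counts as a failure
        if result.returncode != 0:
            return False  # runtime error counts as a failure
        if result.stdout.strip() != expected.strip():
            return False  # wrong answer
    return True

# A toy "A+B" problem with two hidden test cases.
candidate = "a, b = map(int, input().split())\nprint(a + b)"
print(run_against_tests(candidate, [("1 2", "3"), ("10 5", "15")]))  # True
```

Real harnesses add sandboxing and memory limits, but the pass/fail logic is essentially this all-tests-must-pass check.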

Evaluation Metrics

Models are typically evaluated on:

  • Pass Rate: Percentage of problems for which the generated code passes all test cases
  • Execution Time: How efficiently the generated code runs
  • Code Quality: Readability, style, and efficiency
  • Problem Understanding: Ability to correctly interpret problem statements

Purpose

LiveCodeBench provides ongoing, standardized evaluation of Code LLMs, ensuring that model comparisons remain relevant as programming challenges evolve.


Source: LiveCodeBench
