Humanity's Last Exam

Publisher: Center for AI Safety & Scale AI
Last Sync: 2026-02-12

Overview

Humanity’s Last Exam (HLE) is a large-scale academic benchmark created by the Center for AI Safety and Scale AI. It aims to test AI models at the frontier of expert knowledge across a broad range of academic domains.

Key Statistics

Metric            Value
Total Questions   2,500
Subject Areas     Dozens of subjects
Question Types    Multiple-choice, short-answer
Grading           Automated grading compatible
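
Because both question types are closed-ended, responses can in principle be checked programmatically. The following is a minimal sketch of one way such grading could work, assuming simple normalized exact matching. The `Question` record and the `normalize`, `grade`, and `accuracy` helpers are hypothetical names introduced for illustration; this is not HLE's official schema or grading pipeline.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """Hypothetical record for a closed-ended question (not HLE's real schema)."""
    prompt: str
    answer: str  # reference answer, e.g. "B" or "Noether's theorem"
    kind: str    # "multiple_choice" or "short_answer"


def normalize(text: str) -> str:
    """Lowercase, trim, and collapse internal whitespace."""
    return re.sub(r"\s+", " ", text.strip().lower())


def grade(question: Question, response: str) -> bool:
    """Return True when the response matches the reference answer."""
    if question.kind == "multiple_choice":
        # Accept a leading single-letter choice such as "B", "b)" or "(B)".
        match = re.match(r"\s*\(?([A-Za-z])\b", response)
        return bool(match) and match.group(1).lower() == normalize(question.answer)
    # Short answers: assumed here to be graded by normalized exact match.
    return normalize(response) == normalize(question.answer)


def accuracy(questions, responses) -> float:
    """Fraction of responses graded correct (0.0 for an empty set)."""
    if not questions:
        return 0.0
    correct = sum(grade(q, r) for q, r in zip(questions, responses))
    return correct / len(questions)
```

Exact matching is deliberately strict: it will reject correct short answers that are phrased differently from the reference, so a real evaluation would likely need a more tolerant comparison.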

Subject Coverage

HLE spans an extraordinarily broad range of disciplines:

  • Mathematics: From advanced calculus to abstract algebra
  • Humanities: Literature, philosophy, history, political science
  • Natural Sciences: Physics, chemistry, biology, astronomy
  • Social Sciences: Economics, psychology, sociology
  • Professional Fields: Law, medicine, engineering

Development Process

Questions in HLE were developed by subject-matter experts from around the world, with the aim of ensuring:

  • High quality and accuracy
  • True representation of expert-level challenges
  • Resistance to memorization-based approaches
  • Coverage of both common and obscure topics

Purpose

HLE is designed to be the final closed-ended academic benchmark of its kind: a comprehensive test of whether AI has reached expert-level performance across domains of human knowledge.


Source: Last Exam

Benchmark Snapshot