OpenAI Releases GPT-5: Reasoning Surpasses Human Experts, Gold Medal Math No Problem

OpenAI launches GPT-5, reporting performance at or above human expert level in mathematical reasoning, code generation, and scientific reasoning, including a near-perfect 98% on the AIME 2024 math competition and 87% on IMO gold medal problems.

Overview

OpenAI launched GPT-5 today, marking the company's most significant model upgrade since GPT-4.

GPT-5 achieves human expert-level performance across mathematical reasoning, code generation, and scientific reasoning, and becomes the first AI model to post a near-perfect score on the AIME math competition (98% on AIME 2024).

Benchmark Performance

Benchmark	GPT-4o	GPT-5
AIME 2024	52%	98%
GPQA Diamond	65%	94%
Humanity's Last Exam	8%	67%
SWE-Bench Verified	49%	81%
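
To put the table in perspective, the short Python sketch below works out two figures for each benchmark: the gain in percentage points, and the share of the earlier model's misses that the newer model closes. The scores are copied from the GPT-4o and GPT-5 columns above; the function names `point_gain` and `error_reduction` are illustrative choices, not part of any OpenAI tooling.

```python
# Scores copied from the benchmark table (percent): (GPT-4o, GPT-5).
scores = {
    "AIME 2024": (52, 98),
    "GPQA Diamond": (65, 94),
    "Humanity's Last Exam": (8, 67),
    "SWE-Bench Verified": (49, 81),
}


def point_gain(old, new):
    """Absolute improvement in percentage points."""
    return new - old


def error_reduction(old, new):
    """Fraction of the old model's misses (100 - old) that the new model closes."""
    return (new - old) / (100 - old)


for name, (old, new) in scores.items():
    print(
        f"{name}: +{point_gain(old, new)} points, "
        f"{error_reduction(old, new):.0%} of the remaining gap closed"
    )
```

By this measure the largest point gain is on Humanity's Last Exam, while the largest share of remaining errors closed is on AIME 2024, where the earlier score left the least room to improve.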

Capability Demonstrations

OpenAI showcased three key GPT-5 capabilities at launch:

Mathematical Research: Generating complete proofs for IMO gold medal-level problems within 5 minutes
Code Development: Independently building a fully functional social app backend, including API design, database architecture, and DevOps deployment scripts
Scientific Research Assistance: Proposing a viable gene editing hypothesis after reading 100 biology papers, then designing an experiment to verify it

Safety Measures

GPT-5 incorporates OpenAI's latest value alignment framework, which raises the refusal rate for harmful content requests to 99.7%. However, critics argue that stronger reasoning capability also opens the door to subtler, harder-to-detect "jailbreak" attacks.

Market Reaction

Following GPT-5's release, OpenAI's valuation surged past $500 billion. Shares of several edtech companies fell sharply, as the industry broadly expects AI tutors to replace traditional online tutoring.

Disclaimer

Content is AI-generated. Do not use it as a basis for real decisions. Do not cite it as factual reporting.