OpenAI Releases GPT-5: Reasoning Surpasses Human Experts, Gold Medal Math No Problem
OpenAI launches GPT-5, surpassing human expert-level performance in mathematical reasoning, code generation, and scientific reasoning — achieving a perfect score on the AIME math competition and 87% on IMO gold medal problems.
Overview
OpenAI launched GPT-5 today, marking the company's most significant model upgrade since GPT-4.
GPT-5 achieves human expert-level performance across mathematical reasoning, code generation, and scientific reasoning, becoming the first AI model to achieve a perfect score on the AIME math competition.
Benchmark Performance
| Benchmark | GPT-4o | GPT-5 |
|---|---|---|
| AIME 2024 | 52% | 98% |
| GPQA Diamond | 65% | 94% |
| Humanity's Last Exam | 8% | 67% |
| SWE-Bench Verified | 49% | 81% |
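As a reading aid, the gap between the two columns can be worked out directly. The following is a minimal Python sketch that takes the table figures as reported; the names `scores` and `point_gains` are illustrative and do not come from the announcement.

```python
# Scores copied from the table above, in percent: (GPT-4o, GPT-5).
scores = {
    "AIME 2024": (52, 98),
    "GPQA Diamond": (65, 94),
    "Humanity's Last Exam": (8, 67),
    "SWE-Bench Verified": (49, 81),
}


def point_gains(table):
    """Return the absolute gain in percentage points for each benchmark."""
    return {name: new - old for name, (old, new) in table.items()}


gains = point_gains(scores)

# List benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: +{gain} points")
```

By this measure the largest jump is on Humanity's Last Exam (59 points) and the smallest on GPQA Diamond (29 points).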
Capability Demonstrations
OpenAI showcased three key GPT-5 capabilities at launch:
- Mathematical Research: Generating complete proofs for IMO gold medal-level problems in 5 minutes
- Code Development: Independently completing a fully functional social app backend, including API design, database architecture, and DevOps deployment scripts
- Scientific Research Assistance: After reading 100 biology papers, proposing a viable gene editing hypothesis and designing a verification experiment
Safety Measures
GPT-5 incorporates OpenAI's latest value alignment framework, raising its refusal rate for harmful content requests to 99.7%. However, critics argue that greater reasoning capability also brings more covert "jailbreak" risks.
Market Reaction
Following GPT-5's release, OpenAI's valuation surged past $500 billion. Multiple edtech companies saw significant stock drops, with the industry broadly expecting AI tutors to replace traditional online tutoring.
Disclaimer
Content is AI-generated. Do not use it as a basis for real decisions. Do not cite it as factual reporting.