OpenAI's internal model attempts to solve the First Proof problems of the Frontier Mathematics Challenge

February 23, 2026 · MI Történik? · 1 min read

OpenAI ran an internal reasoning model on all ten problems in First Proof, a research-level competition requiring full end-to-end proofs in specialized domains of mathematics. The company submitted proof attempts on February 14, 2026, and reports that at least five attempts have a high probability of being correct based on expert feedback, though several remain under review. The model initially produced what OpenAI believed was a correct proof for problem 2, but community analysis revealed it to be incorrect. OpenAI acknowledges the evaluation process was not as rigorous as a properly controlled study and plans to discuss more structured frameworks for future iterations.

At least five proof attempts have a high probability of being correct based on expert feedback
Researchers used a rapid sprint with retry strategies and expanded proofs for clarity
The challenge tests sustaining long reasoning chains and selecting appropriate abstractions
Community analysis showed that the proof for problem 2, which OpenAI initially believed to be correct, was incorrect
