AI REASONING
OpenAI's internal model attempts solutions to the Frontier Mathematics Challenge First Proof problems
OpenAI ran an internal reasoning model on all ten problems in First Proof, a research-level competition requiring full end-to-end proofs in specialized domains of mathematics. The company submitted proof attempts on February 14, 2026, and reports that at least five attempts have a high probability of being correct based on expert feedback, though several remain under review. The model initially produced what OpenAI believed was a correct proof for problem 2, but community analysis revealed it to be incorrect. OpenAI acknowledges the evaluation process was not as rigorous as a properly controlled study and plans to discuss more structured frameworks for future iterations.
- At least five proof attempts have a high probability of being correct based on expert feedback
- Researchers worked in a rapid sprint, using retry strategies and expanding proofs for clarity
- The challenge tests sustaining long reasoning chains and selecting appropriate abstractions
- Community analysis showed that the model's proof for problem 2, which OpenAI initially believed to be correct, was incorrect