OpenAI's strongest model, o3, accused of cheating after gaining privileged advance access to the FrontierMath question bank

Bitget, 2025/01/21 03:29

An Epoch AI contractor writing under the name "Meemi" revealed on the LessWrong forum that OpenAI not only funded the FrontierMath benchmark but also obtained privileged access to its question bank.

Tamay Besiroglu, Deputy Director and a co-founder of Epoch AI, soon acknowledged this on X: "We made a mistake by not disclosing OpenAI's involvement in FrontierMath earlier. Our contract prohibited us from doing so before o3 was released. In hindsight, we should have pushed harder for transparency sooner. We acknowledge this and promise to do better in the future."

Elliot Glazer, Chief Mathematician at Epoch AI, acknowledged that he had not proactively disclosed the industry funding during the project, and he apologized to mathematicians who might not have participated had they known. On the o3 results, he said he was confident that the scores OpenAI reported are accurate, but stressed that Epoch AI still needs to verify them against an independent holdout set, and he promised that evaluation results on that set will be made public. Asked about the status of the holdout set, Glazer clarified that it is still under development and not yet complete.

FrontierMath is reported to be a highly regarded benchmark for assessing advanced mathematical reasoning. It was created by Epoch AI together with more than 60 leading mathematicians, including several Fields Medalists and experienced problem setters for the International Mathematical Olympiad.

Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.
