What will be state-of-the-art performance on the MATH dataset on the following dates?
Authors:
Opened: Jul 4, 2022
Closes: Jun 29, 2025
Scheduled resolution: Jun 29, 2025
What will be state-of-the-art accuracy on the Massive Multitask dataset on the following dates?
94.6
What will be the best performance on FrontierMath by December 31st 2025?
65.7
For these benchmarks, what percentage of problems do you estimate the top-performing AI model or agent will be able to solve by December'25?