Contributed by the Harvard AI Safety Student Team community.
Question
What will be the best performance on SWE-bench Verified by December 31st 2025?
Total Forecasters: 10
Community Prediction
94.2
(90.5 - 96.9)
This question is closed for predictions, and is waiting to be resolved
Lower bound | Community | My Prediction |
<64.6 | 0.4% | — |
Quartiles | | |
Lower 25% | 90.48 | — |
Median | 94.22 | — |
Upper 75% | 96.93 | — |
Authors:
Opened: Feb 9, 2025
Closed: Feb 17, 2025
Scheduled resolution: Jan 1, 2026
Spot Scoring Time: Feb 17, 2025
When will an AI achieve a score of 1.5 or higher on RE-bench at any time budget between 8h and 32h?
07 Jan 2027
What will be the best score by an AI on the full Humanity's Last Exam (HLE) before 2026?
56.3
What will be the best non-human SAT-style score on the hard subset of the QuALITY dataset by January 1, 2030?
97