3 comments
22 forecasters
What will be the best non-human SAT-style score on the hard subset of the QuALITY dataset by January 1, 2040?
Current estimate
99.1%
Authors:
Opened: Feb 4, 2022
Closes: Jan 2, 2040
Scheduled resolution: Jan 2, 2040
Spot Scoring Time: Feb 6, 2022
What will be the best non-human SAT-style score on the hard subset of the QuALITY dataset by January 1, 2030?
96.9%
(92.6 - 98.8)
11 forecasters
What will be the best score by an AI on the full Humanity's Last Exam (HLE) before 2026?
49.1%
(37.5 - 60.7)
60 forecasters
In the following years, what will be the highest LLM scores on the GPQA Diamond benchmark?
21 forecasters