28 comments
74 forecasters
By May 2020, will a single language model obtain an average score equal to or greater than 90% on the SuperGLUE benchmark?
85% chance
Resolved: No
The community gave this an 85% chance, and it resolved No.
Authors:
Opened: Aug 9, 2019
Closes: Dec 30, 2019
Resolved: May 3, 2020
Spot Scoring Time: Aug 10, 2019
What will be the best score by an AI on the full Humanity's Last Exam (HLE) before 2026?
60.8%
(51.6 - 72.1)
47 forecasters
What will be the best non-human SAT-style score on the hard subset of the QuALITY dataset by January 1, 2030?
96.7%
(92.3 - 98.7)
11 forecasters
What will be the best non-human SAT-style score on the hard subset of the QuALITY dataset by January 1, 2040?
99.1%
(96.3 - 99.7)
22 forecasters