34 comments
103 forecasters

What will be the state-of-the-art language modelling performance (in perplexity) on WikiText-103 by the following dates?

This question is closed for predictions, and is waiting to be resolved

Authors:
casens
Opened: Dec 14, 2020
Closed: Feb 13, 2025
Scheduled resolution: Dec 13, 2026
Forecasting AI Progress
AI Progress Essay Contest
AI Technical Benchmarks
Forecasting AI Progress: Hill Climbing Round
Forecasting AI Progress: Maximum Likelihood Round
Computing and Math
Artificial Intelligence
🏆 2021-2025 Leaderboard
🏆 2021 Leaderboard
🏆 2020-2021 Leaderboard

In the following years, what will be the highest LLM scores on the GPQA Diamond benchmark?

2024
87.7
2025
93
2026
98.2
1 other
21 forecasters

What will be the state-of-the-art performance on image classification on ImageNet in top-1 accuracy on the following dates?

December 14, 2024
91.9
December 14, 2026
93.7
78 forecasters

When will a language model be developed that, when tested, yields approximately human-level output?

05 Jun 2024
(05 May 2023 - 10 Feb 2027)
36 forecasters