Contributed by the Risk Threshold Forecasting community.

Question

When will an 8-hour, 80%-reliability time horizon be achieved on METR’s Autonomy Tasks by a GPT-4.5-scale model from OpenAI?

Total Forecasters: 0
Upper bound: >Jun 2040
Authors:
Opened: Jun 25, 2025
Closes: Jan 1, 2041
Scheduled resolution: Jan 1, 2041
Spot Scoring Time: Jan 1, 2041