When will an 8 hour, 80% reliability time horizon be achieved on METR’s Autonomy Tasks by a GPT-4.5 scale model by OpenAI?

Total Forecasters: 0

Community Prediction: No forecasts yet

Authors: romeodean

Opened: Jun 25, 2025

Closes: Jan 1, 2041

Scheduled resolution: Jan 1, 2041

Spot Scoring Time: Jun 28, 2025

Risk Threshold Forecasting

Artificial Intelligence

When will an AI achieve a score of 1.5 or higher on RE-bench at any time budget between 8h and 32h?

20 Nov 2027

When will Anthropic reach or surpass ASL-4?

Nov 2030

When will an AI model trained with the following orders of magnitude more compute than GPT-4 be released?

31 May 2029
