Contributed by the Risk Threshold Forecasting community.

When will 75% accuracy be reached on LAB-Bench Cloning Scenarios by a Claude Opus 4 scale model by Anthropic?

Current estimate

Key Factors

No key factors yet.

No forecasts yet

Authors:

Opened: Jun 25, 2025

Closes: Jan 1, 2041

Scheduled resolution: Jan 1, 2041

Spot Scoring Time: Jun 28, 2025

Risk Threshold Forecasting

Artificial Intelligence

Anthropic launches Claude Sonnet 4.5, its best AI model for coding (TechCrunch, Sep 29, 2025)

Anthropic unveils latest AI model, aiming to extend its lead in coding intelligence (Business Insider, Sep 29, 2025)

Anthropic releases Claude Sonnet 4.5 in latest bid for AI agents and coding supremacy (Verge, Sep 29, 2025)

When will Anthropic reach or surpass ASL-4?

(27 Nov 2027 - Feb 2035)

When will Anthropic first report that an AI system reached or surpassed CBRN risk level 4?

(24 Oct 2026 - Nov 2031)

When will an AI achieve a score of 1.5 or higher on RE-Bench at any time budget between 8h and 32h?

(10 Jan 2027 - 26 Nov 2028)
