Question Feed | Metaculus

Contributed by the Unjournal Forecasting community.

How many evaluation packages will The Unjournal post in the year 2025?

Key Factor

Additional grant funding

Contributed by the Risk Threshold Forecasting community.

When will Anthropic first report that an AI system has reached or surpassed the following AI R&D risk levels?

Contributed by the Risk Threshold Forecasting community.

When will Google first report that an AI system has reached or surpassed the following Instrumental Reasoning risk levels?

On the DesignSafe-CI portal, how many publications will there be in June 2025?

0-20 (result: Yes)

21-30 (result: No)

31-40 (result: No)

Contributed by the Risk Threshold Forecasting community.

When will an 8 hour, 80% reliability time horizon be achieved on METR’s Autonomy Tasks by a Gemini 2.5 Pro scale model by Google?

Contributed by the Risk Threshold Forecasting community.

When will an 8 hour, 80% reliability time horizon be achieved on METR’s Autonomy Tasks by a Grok 3 scale model by xAI?

Platform feature suggestions

112

2961 comments

Contributed by the Risk Threshold Forecasting community.

When will 80% accuracy be achieved on Cybench by a Claude Opus 4 scale model by Anthropic?

Contributed by the Risk Threshold Forecasting community.

When will 80% accuracy be achieved on Cybench by a Claude Sonnet 4 scale model by Anthropic?

Contributed by the Risk Threshold Forecasting community.

When will 80% accuracy be achieved on Cybench by a Gemini 2.5 Pro scale model by Google?