Contributed by the Unjournal Forecasting community.

How many evaluation packages will The Unjournal post in the year 2025?

Key Factor

Additional grant funding

Contributed by the Risk Threshold Forecasting community.

When will Anthropic first report that an AI system has reached or surpassed the following AI R&D risk levels?

Contributed by the Risk Threshold Forecasting community.

When will Google first report that an AI system has reached or surpassed the following Instrumental Reasoning risk levels?

On the DesignSafe-CI portal, how many publications will there be in June 2025?

0-20 (result: Yes)
21-30 (result: No)
31-40 (result: No)
and 1 other

Contributed by the Risk Threshold Forecasting community.

When will an 8 hour, 80% reliability time horizon be achieved on METR’s Autonomy Tasks by a Gemini 2.5 Pro scale model by Google?

Contributed by the Risk Threshold Forecasting community.

When will an 8 hour, 80% reliability time horizon be achieved on METR’s Autonomy Tasks by a Grok 3 scale model by xAI?

Platform feature suggestions

112
2961 comments
Metaculus Meta

Contributed by the Risk Threshold Forecasting community.

When will 80% accuracy be achieved on Cybench by a Claude Opus 4 scale model by Anthropic?

Contributed by the Risk Threshold Forecasting community.

When will 80% accuracy be achieved on Cybench by a Claude Sonnet 4 scale model by Anthropic?

Contributed by the Risk Threshold Forecasting community.

When will 80% accuracy be achieved on Cybench by a Gemini 2.5 Pro scale model by Google?