Risk Threshold Forecasting
8 Followers
34 Questions
When will OpenAI first report that an AI system has achieved the following a risk levels on AI Self-improvement?
When will an 8 hour, 80% reliability time horizon be achieved on METR’s Autonomy Tasks by a Gemini 2.5 Pro scale model by Google?
Current estimate
18 Jul 2029
When will an 8 hour, 80% reliability time horizon be achieved on METR’s Autonomy Tasks by a Gemini 2.5 Flash scale model by Google?
When will 80% accuracy be achieved on Cybench by a Gemini 2.5 Pro scale model by Google?
When will 80% accuracy be achieved on Cybench by a Gemini 2.5 Flash scale model by Google?
When will 75% accuracy be reached on LAB-Bench Cloning Scenarios by a Gemini 2.5 Pro scale model by Google?
When will 75% accuracy be reached on LAB-Bench Cloning Scenarios by a Gemini 2.5 Flash scale model by Google?
When will Google first report that an AI system reached or surpassed CBRN uplift level 1?
Current estimate
>Jun 2036
When will Anthropic reach or surpass ASL-4?
Current estimate
25 Mar 2029
When will Google first report that an AI system has reached or surpassed the following Machine Learning R&D risk levels?
00