Contributed by the Risk Threshold Forecasting community.

When will a Gemini 2.5 Flash-scale model from Google reach 75% accuracy on LAB-Bench Cloning Scenarios?

Current estimate