Contributed by the JetBrains community.
For these benchmarks, what percentage of problems do you estimate the top-performing AI model or agent will be able to solve by December 2025?
AI2 Reasoning Challenge
Graduate-Level Google-Proof Q&A
Toloka's µ-MATH
Toloka's U-MATH
Epoch's FrontierMath