Contributed by the JetBrains community.

For each of these benchmarks, what percentage of problems do you estimate the top-performing AI model or agent will be able to solve by December 2025?

AI2 Reasoning Challenge: 97.9%
Toloka's µ-MATH: 94.1%
Graduate-Level Google-Proof Q&A: 90.8%