Will an AI system do credibly well on a full math SAT exam by 2025?

Humans have devised many ways of assessing other humans' intelligence, and of compelling people to take part in those assessments. University entrance exams are among the most familiar, inflicted on countless high school students each year as standardized measures of academic competence and promise. Recently, these exams have become the target of AI and machine learning projects.

According to a report by Engadget, Japan’s National Institute of Informatics had been working on an AI since 2011 with the final objective of passing the entrance exam for the University of Tokyo, tentatively by March 2022. However, a recent report has revealed that the institute will be terminating the project because of its AI's inability to fully understand the broad context of the entrance exam questions.

More recently, on September 21, 2015, the Allen Institute for Artificial Intelligence (AI2) announced in a paper that it had created an AI system called GeoS that can solve SAT geometry questions "as well as the average 11th-grade American student." According to this story, GeoS "uses a combination of computer vision to interpret diagrams, natural language processing to read and understand text, and a geometric solver to achieve 49 percent accuracy on geometry questions from the official SAT tests. If these results were extrapolated to the entire Math SAT test, the computer roughly achieved an SAT score of 500 (out of 800), the average test score for 2015." Although AI2 initially focused GeoS on plane geometry questions, it hopes to move on to solving the full set of Math SAT questions by 2018.

Matching the average student on geometry questions is no easy feat; however, it may be significantly more difficult to do well on the complete exam, including all of its sections. We ask:

By the end of 2025, will an AI system achieve the equivalent of a 75th-percentile score on the full mathematics section of an SAT exam comparable to those circa 2015?

Resolution is by credible media report or published paper. The system must be given only page images as input and must have been trained on exams that do not include any questions from the scored test. Exams will count as long as their topics and difficulty are broadly comparable to those of the 2015 exams.


