On May 31st, 2022, prominent deep learning skeptic and NYU professor emeritus Gary Marcus challenged Elon Musk to a bet on AGI by the end of 2029. His proposed bet consisted of five AI achievements, of which he predicted no more than two would come to pass before 2030. This question is about Marcus' first prediction:

In 2029, AI will not be able to watch a movie and tell you accurately what is going on (what I called the comprehension challenge in The New Yorker, in 2014). Who are the characters? What are their conflicts and motivations? etc.

For this challenge, we will use the MovieQA dataset as an illustrative example of a benchmark that could trigger positive resolution. Its authors describe it as follows:

The dataset consists of 14,944 questions about 408 movies with high semantic diversity. The questions range from simpler "Who" did "What" to "Whom", to "Why" and "How" certain events occurred. Each question comes with a set of five possible answers; a correct one and four deceiving answers provided by human annotators.
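
To make the five-answer format concrete, here is a minimal Python sketch of how accuracy on a multiple-choice benchmark of this shape could be scored. The class, its field names, and the toy questions are assumptions made up for illustration; they do not reflect MovieQA's actual file format or evaluation code. With five options per question, random guessing scores about 20%, which is the baseline any system would need to beat by a wide margin.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


# Hypothetical record layout; the real MovieQA files may be organized differently.
@dataclass(frozen=True)
class MultipleChoiceQuestion:
    movie_id: str
    question: str
    answers: Sequence[str]  # five candidates: one correct, four distractors
    correct_index: int      # position of the correct answer in `answers`


# With five options, uniform random guessing is right one time in five.
CHANCE_ACCURACY = 1 / 5


def accuracy(
    questions: Sequence[MultipleChoiceQuestion],
    predict: Callable[[MultipleChoiceQuestion], int],
) -> float:
    """Return the fraction of questions where `predict` picks the correct index."""
    if not questions:
        raise ValueError("need at least one question to compute accuracy")
    hits = sum(1 for q in questions if predict(q) == q.correct_index)
    return hits / len(questions)


# Toy data invented for this sketch, not taken from the dataset.
toy_questions = [
    MultipleChoiceQuestion(
        movie_id="toy-1",
        question="Who rescues the captain?",
        answers=("The pilot", "The doctor", "The thief", "The mayor", "Nobody"),
        correct_index=0,
    ),
    MultipleChoiceQuestion(
        movie_id="toy-2",
        question="Why does the thief leave town?",
        answers=("Debt", "Love", "Fear of arrest", "Boredom", "A job offer"),
        correct_index=2,
    ),
]


def always_first(question: MultipleChoiceQuestion) -> int:
    """Trivial baseline predictor that always picks the first answer."""
    return 0


toy_accuracy = accuracy(toy_questions, always_first)
```

On the two toy questions, the always-first baseline gets one of two right. A real evaluation would replace `always_first` with a system that actually watches the film and would run over the full question set.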