AI Programming Competency before 2030


On May 31, 2022, prominent deep learning skeptic and NYU professor emeritus Gary Marcus challenged Elon Musk to a bet on AGI by the end of 2029. His proposed bet consists of 5 AI achievements, of which he predicted no more than 2 would come to pass before 2030. This question is about Marcus' fourth prediction:

In 2029, AI will not be able to reliably construct bug-free code of more than 10,000 lines from natural language specification or by interactions with a non-expert user. [Gluing together code from existing libraries doesn’t count.]

Will an AI be able to reliably construct bug-free code of more than 10,000 lines before 2030?

This question will resolve as Yes if before January 1, 2030, there is a public and credible demonstration of an AI writing code that clearly indicates the capability to do either of the following:

(1) Given a natural language description of a complex computer program comparable to the non-research-related ideas found in this list of programming projects, the AI is able to write a computer program that satisfies the description to a satisfactory degree in at least 90.0% of cases. A computer program is said to have satisfied the conditions of a natural language description if there is a consensus among Metaculus admins that the code satisfies the conditions, without any major bugs. Minor bugs, such as the code occasionally crashing, will not disqualify any AI, as these are common even for professional human programmers.

(2) The AI is able to perform (1) when given the ability to interact with a non-expert user. A non-expert user is defined as someone who credibly reports not being able to write code that satisfies the conditions of these project ideas, but who is able to operate a computer well enough to understand whether a given computer program passes the requirements to a satisfactory degree.

Importantly, as per Marcus' constraint, we will not allow the AI to simply glue together code from existing libraries. It must generate code de novo, meaning that a plagiarism detector on par with the Copyleaks code plagiarism checker would not flag the code as definitively indicating cheating in more than 5% of cases.
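The two numeric thresholds above (at least 90% of cases satisfied, no more than 5% of cases flagged for plagiarism) can be sketched as a simple aggregate check. This is only an illustration: the names `TrialResult` and `meets_thresholds` are hypothetical, and the per-trial judgments themselves (admin consensus, plagiarism-checker output) are assumed to be supplied as inputs rather than computed.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class TrialResult:
    """One hypothetical demonstration trial (illustrative, not an official format)."""
    satisfied: bool  # admins agree the program meets the description, no major bugs
    flagged: bool    # plagiarism checker definitively flags the generated code


def meets_thresholds(trials, min_success=Fraction(9, 10), max_flagged=Fraction(1, 20)):
    """Return True if the trials meet both numeric thresholds from the criteria.

    Exact fractions are used so that boundary cases (exactly 90% or exactly 5%)
    are not affected by floating-point rounding.
    """
    if not trials:
        return False  # no evidence, so nothing is demonstrated
    n = len(trials)
    success_rate = Fraction(sum(t.satisfied for t in trials), n)
    flagged_rate = Fraction(sum(t.flagged for t in trials), n)
    return success_rate >= min_success and flagged_rate <= max_flagged
```

For example, 20 trials with 18 satisfied and 1 flagged would pass both thresholds, while 17 satisfied or 2 flagged would not. The check says nothing about whether a demonstration is "public and credible", which remains a matter of judgment.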
