This essay was submitted to the AI Progress Essay Contest, an initiative that focused on the timing and impact of transformative artificial intelligence. You can read the results of the contest and the winning essays here.
Over the past decade, the amount of compute used in the largest AI experiments has been increasing exponentially:
Source: OpenAI
Some of the questions this raises are:
To summarize my conclusions:
Link to spreadsheet with my numbers and charts
To understand how far we can expect to scale compute, let's try to estimate the total global computing capacity and how fast it's growing. We can start with estimated global annual spending on computer hardware, which is roughly $1 trillion and growing at about 3-4% annually (Statista). A common estimate of compute capacity is about 3 years' worth of this spending, or roughly $3 trillion. For comparison, world GDP is ~$90 trillion and growing at ~3% annually.
Next, we look at the price-performance of computing hardware as FLOPS vs $. This chart shows a steady decline in the $ price per GFLOPS - prices generally halve every 1.5 years - in other words, you can buy double the computing power in FLOPS for the same dollar cost every 1.5 years. The current cost is estimated at about $0.3 / GFLOPS (AI Impacts).
Source: Wikipedia history of GFLOPS costs – AI Impacts
This data reflects CPUs and GPUs but not other accelerators e.g. TPUs. Given that I don't have better data, for simplicity I'll just use this data and guesstimate that TPUs will follow similar rates of improvement. I'll also note that the rates of change are very large and much more important than the precise current figures.
Additionally, GPU price-performance has barely improved in the last couple of years due to severe shortages, but I expect this to be transitory.
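As a rough sketch of how these figures combine, the snippet below converts the ~$3 trillion hardware stock into FLOPS at ~$0.3 per GFLOPS and projects it forward with a 1.5-year price halving time. Dividing the dollar stock by the price per GFLOPS is an assumption about how the pieces fit together, the names are illustrative, and for simplicity the dollar value of the stock is held constant.

```python
# Figures quoted in the text; the way they are combined here is an assumption.
HARDWARE_STOCK_USD = 3e12    # ~3 years of ~$1 trillion annual hardware spending
USD_PER_GFLOPS = 0.3         # current price estimate (AI Impacts)
PRICE_HALVING_YEARS = 1.5    # price per GFLOPS halves every ~1.5 years


def global_capacity_flops(years_from_now: float = 0.0) -> float:
    """Estimated global capacity in FLOPS, holding the dollar stock constant."""
    price = USD_PER_GFLOPS * 0.5 ** (years_from_now / PRICE_HALVING_YEARS)
    gflops = HARDWARE_STOCK_USD / price
    return gflops * 1e9  # convert GFLOPS to FLOPS


print(f"today:      {global_capacity_flops(0):.1e} FLOPS")
print(f"in 4 years: {global_capacity_flops(4):.1e} FLOPS")
```

Under these assumptions the estimate comes out near 10^22 FLOPS today and around 6 x 10^22 FLOPS four years later, i.e. on the order of 10^23.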
As of 2022, the maximum compute used to train an AI experiment is an estimated 17 petaflops-days, or roughly 10^16 FLOPS-days. The community forecast for this (starting 1 year before) was pretty spot-on.
Source: OpenAI
If this growth rate were to continue, then by 2026 the maximum training compute would be 10^23 FLOPS-days, while global computing capacity would be 10^23 FLOPS if it continued growing at current rates (doubling every 1.5 years). In other words, the largest AI experiment would require the equivalent of about a day of all global compute. A few years after that, the same growth rate would require years of all global compute, and the cost of that compute would exceed world GDP.
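The "day of all global compute" comparison is just a unit division: FLOPS-days of training divided by FLOPS of capacity gives days. A minimal sketch, with an illustrative function name:

```python
def days_of_global_compute(training_flops_days: float, capacity_flops: float) -> float:
    """Days that all of the world's hardware would need to run to supply one training run."""
    return training_flops_days / capacity_flops


# 2026 figures from the paragraph above.
days = days_of_global_compute(training_flops_days=1e23, capacity_flops=1e23)
print(days)        # days of all global compute
print(days / 365)  # the same quantity expressed in years
```

A training run would have to be about 365 times larger relative to capacity before it tied up a full year of global compute.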
So this growth rate in AI computing scale is clearly unsustainable given the growth rate in total compute capacity. And the community indeed forecasts that AI compute will grow much more slowly going forwards - the median forecast is about FLOPS-days by 2026, and only about 10 times more than that by 2031. A growth rate of 10x over 5 years is in fact almost the same as the growth rate of total global computing capacity discussed above.
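The claim that 10x over 5 years roughly matches a 1.5-year doubling time can be checked directly. This is a small sketch with an illustrative helper name:

```python
def growth_factor(years: float, doubling_years: float) -> float:
    """Total growth over `years` for a quantity that doubles every `doubling_years`."""
    return 2 ** (years / doubling_years)


# Global compute capacity doubling every 1.5 years, compounded over 5 years.
print(round(growth_factor(5, 1.5), 2))
```

The result is just over 10, so the two growth rates are indeed almost the same.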
It's hard to quantify AI performance in a way that allows us to easily discuss "how fast AI is progressing", but we can look at a more easily measurable metric of price-performance: dollar cost to train to achieve a fixed accuracy on a given benchmark.
The 2022 AI Index report charts training time and dollar cost to achieve 93% accuracy on ImageNet. Over the 4-year period from 2017 to 2021, the cost in dollars decreased by a factor of 223. This decrease was far from steady, but for the sake of making rough estimates I'll use the annualized rate: ~4x per year, or halving roughly every six months on average. (The data sources I used for this were particularly limited, so these estimates are very much only ballpark figures - additional data could help a lot here.)
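For reference, the annualized rate and halving time follow from the 223x figure as below. This is a sketch that assumes a constant rate of decline, which the text notes is only an approximation; the function names are illustrative.

```python
import math


def annualized_factor(total_factor: float, years: float) -> float:
    """Constant yearly factor that compounds to `total_factor` over `years`."""
    return total_factor ** (1 / years)


def halving_time_years(annual_factor: float) -> float:
    """Years for the cost to halve when it falls by `annual_factor` each year."""
    return math.log(2) / math.log(annual_factor)


annual = annualized_factor(223, 4)
print(round(annual, 2))                      # close to 4x per year
print(round(halving_time_years(annual), 2))  # close to half a year
```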
Meanwhile, GPU cost per FLOPS has been halving every ~1.5 years, or about a factor of 6 over 4 years. (However, the hardware used in these experiments may not have followed the same trend as global hardware - this could be better answered if we had figures for both FLOPS-day and dollar costs.) We can infer that the vast majority of the training cost reduction over those 4 years was likely from software improvements (or more generally, from taking better advantage of existing hardware, e.g. perhaps making wider use of the cheapest existing hardware like TPUs). The reduction attributed to software is roughly a factor of 223/6 ≈ 37, or about 2.5x annually.
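The split between hardware and software can be sketched as follows. It assumes the two effects multiply, so the software share is the total reduction divided by the hardware share. The code uses the unrounded hardware factor, so it lands slightly below the 37x obtained with the rounded factor of 6.

```python
TOTAL_COST_REDUCTION = 223    # ImageNet training cost, 2017 to 2021
YEARS = 4
HARDWARE_HALVING_YEARS = 1.5  # GPU cost per FLOPS halving time

# Hardware share: compounding a 1.5-year halving time over 4 years (about 6x).
hardware_factor = 2 ** (YEARS / HARDWARE_HALVING_YEARS)

# Software share: whatever remains after dividing out hardware (assumed multiplicative).
software_factor = TOTAL_COST_REDUCTION / hardware_factor
software_annual = software_factor ** (1 / YEARS)

print(round(hardware_factor, 1))
print(round(software_factor, 1))
print(round(software_annual, 2))
```

Either way the annual software factor comes out near 2.5x, well above the hardware contribution of roughly 1.6x per year.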
So even if the dollars spent on the largest AI experiments level off, we would expect to be able to achieve the same results at much lower cost. This suggests we should continue to see rapid progress on the state of the art even holding costs constant, though of course not as rapid as if we were also able to keep growing the dollars spent as fast as we are today.
To try to quantify this, we can think of the rate of price reduction to achieve a given performance benchmark (~4x annually) as an indicator of "AI performance" improving by 4x annually for constant dollars spent. As I mentioned earlier, "AI performance" of course means very different things in different contexts (this may not generalize to problems different from ImageNet), but it can at least give us a rough starting point to think about.
We can compare this to the growth rate of the largest AI experiments in dollars of compute (~7x annually, calculated from AI experiment FLOPS-days and the cost per FLOPS). So over the past few years, the total growth rate of "AI performance" on the largest experiments can be estimated as roughly 4 x 7 = 28x annually, including both the increased investment in AI and the increased effectiveness per dollar of that investment. If AI compute spending in dollars levels off (growing at a similar pace to world GDP), then the total growth rate of maximum "AI performance" would be just the factor of 4x annually (which reflects both software improvements and hardware getting cheaper). So while today's doubling time of "AI performance" is ~2-3 months, the doubling time after spending levels off would become ~6 months. Again, these are extremely rough numbers based on extremely little data, and these are back-of-the-envelope calculations that I haven't done very carefully, so take them with an ocean of salt. But they may at least give some ballpark idea of how to extrapolate AI trends into the future.
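The doubling times quoted above follow from the annual growth factors. This is a minimal sketch that assumes smooth exponential growth; the helper name is illustrative.

```python
import math


def doubling_time_months(annual_factor: float) -> float:
    """Months needed to double at a constant `annual_factor` of growth per year."""
    return 12 * math.log(2) / math.log(annual_factor)


cost_efficiency_gain = 4  # ~4x per year cheaper to reach a fixed benchmark
spending_growth = 7       # ~7x per year growth in dollars of compute
combined = cost_efficiency_gain * spending_growth  # 28x per year

print(round(doubling_time_months(combined), 1))              # spending still growing
print(round(doubling_time_months(cost_efficiency_gain), 1))  # spending levelled off
```

This gives about 2.5 months while spending keeps growing and 6 months once it levels off.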