
How much of AI progress is from scaling compute? And how far will it scale?

{{"estimatedReadingTime" | translate:({minutes: qctrl.question.estimateReadingTime()})}}
AI Progress Essay Contest

This essay was submitted to the AI Progress Essay Contest, an initiative that focused on the timing and impact of transformative artificial intelligence. You can read the results of the contest and the winning essays here.


Over the past decade, the amount of compute used in the largest AI experiments has been increasing exponentially:

Source: OpenAI

Some of the questions this raises are:

  • How long can we expect this rate of growth to continue?
  • How much is scaling compute (hardware) contributing to AI progress compared to using the compute better (software)?

To summarize my conclusions:

  • The rate of growth of AI compute is forecasted to start slowing dramatically within the next 5-10 years, as the compute required starts to approach an appreciable fraction of total global computing capacity, and its cost an appreciable fraction of world GDP.
  • However, the price-performance of AI models is also rapidly improving. So even if the dollars spent on the largest AI experiments level off, we would still expect to see rapid (though slower) improvement in the effective size of the experiments we are able to perform.
  • I interpret the data to weakly suggest that hardware scaling contributes the bulk of AI progress today, but that software improvements will account for the majority of AI progress over the next few years as hardware scaling slows.

Link to spreadsheet with my numbers and charts

Estimating compute capacity

To understand how far we can expect to scale compute, let's try to estimate the total global computing capacity and how fast it's growing. We can start with estimated global annual spending on computer hardware, which is roughly $1 trillion and growing at about 3-4% annually (Statista). A common estimate of installed compute capacity is about three years' worth of this spending - roughly $3 trillion. For comparison, world GDP is ~$90 trillion and growing at ~3% annually.

Next, we look at the price-performance of computing hardware in FLOPS per dollar. This chart shows a steady decline in the dollar price per GFLOPS - prices generally halve every 1.5 years; in other words, every 1.5 years you can buy double the computing power for the same dollar cost. The current cost is estimated at about $0.3 / GFLOPS (AI Impacts).

Source: Wikipedia history of GFLOPS costs – AI Impacts

This data reflects CPUs and GPUs but not other accelerators such as TPUs. Since I don't have better data, for simplicity I'll use it as-is and guesstimate that TPUs follow similar rates of improvement. I'll also note that the rates of change are very large and matter much more than the precise current figures.

Additionally, GPU price-performance has barely improved in the last couple of years due to severe shortages, but I expect this to be transitory.

Combining these two estimates puts global computing capacity at about 10^22 FLOPS, doubling every 1.5 years. The growth comes almost entirely from improved price-performance (~60% annual growth), while total compute hardware spending is growing at ~3-4% annually.
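To make the arithmetic explicit, here's a minimal Python sketch of this capacity estimate; the spending, price, and halving-time figures are just the rough estimates quoted above, not precise data.

```python
# Back-of-the-envelope estimate of global computing capacity,
# using the rough figures quoted above.

annual_hw_spending_usd = 1e12        # ~$1 trillion/year spent on computer hardware
installed_base_years = 3             # assume ~3 years of spending is still in use
usd_per_gflops = 0.3                 # ~$0.3 per GFLOPS (AI Impacts)
price_halving_years = 1.5            # $/FLOPS halves roughly every 1.5 years

installed_base_usd = annual_hw_spending_usd * installed_base_years  # ~$3 trillion
global_flops = installed_base_usd / usd_per_gflops * 1e9            # GFLOPS -> FLOPS

annual_price_perf_growth = 2 ** (1 / price_halving_years) - 1       # ~59%/year

print(f"Global capacity: ~{global_flops:.0e} FLOPS")                    # ~1e+22 FLOPS
print(f"Price-performance growth: ~{annual_price_perf_growth:.0%}/yr")  # ~59%/yr
```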

AI compute scaling

As of 2022, the maximum compute used to train a single AI model is an estimated 17 petaFLOPS-days, or roughly 10^16 FLOPS-days. The community forecast for this (made a year earlier) was pretty spot-on.

This fits the exponential growth trend we've been seeing over the last decade, with maximum compute doubling every 3.4 months:

Source: OpenAI

If this growth rate were to continue, then by 2026 the maximum training compute would be 10^23 FLOPS-days, while global computing capacity would be 10^23 FLOPS if it continued growing at current rates (doubling every 1.5 years). In other words, the maximum AI experiment would require training for the equivalent of about a day of all global compute. And by a few years after that, the same growth rate would require years of all global compute, and the cost of that compute would exceed world GDP.
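As a sanity check on that comparison, here's a small sketch in the same spirit; the 2022 starting point and the extrapolated 2026 values are the figures quoted above, not independent data.

```python
# Compare maximum training compute (FLOPS-days) against global capacity (FLOPS).
# Dividing the two gives "days of all global compute" needed for one training run.

flops_days_2022 = 17e15              # 17 petaFLOPS-days ~= 1.7e16 FLOPS-days
capacity_2022 = 1e22                 # estimated global capacity, FLOPS

flops_days_2026 = 1e23               # extrapolated maximum training run, FLOPS-days
capacity_2026 = 1e23                 # capacity if it keeps doubling every 1.5 years

print(f"2022: {flops_days_2022 / capacity_2022:.1e} days of global compute")  # ~1.7e-06
print(f"2026: {flops_days_2026 / capacity_2026:.0f} day of global compute")   # ~1
```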


So this growth rate in AI computing scale is clearly unsustainable given the growth rate in total compute capacity. And the community indeed forecasts that AI compute will grow much more slowly going forward - the median forecast for maximum training compute in 2026 is far below the extrapolated trend, with only about a 10x further increase by 2031. A growth rate of 10x over 5 years is in fact almost the same as the growth rate of total global computing capacity discussed above.
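As a quick check on that last claim, capacity doubling every 1.5 years works out to almost exactly 10x over 5 years:

```python
# Global capacity doubling every 1.5 years, compounded over 5 years.
print(f"{2 ** (5 / 1.5):.1f}x")   # ~10.1x, matching the ~10x forecast growth
```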


AI price-performance

It's hard to quantify AI performance in a way that allows us to easily discuss "how fast AI is progressing", but we can look at a more easily measurable metric of price-performance: the dollar cost of training to a fixed accuracy on a given benchmark.

The 2022 AI Index report charts training time and dollar cost to achieve 93% accuracy on ImageNet. Over the 4-year period from 2017 to 2021, the dollar cost decreased by a factor of 223. This decrease happened very non-linearly, but for the sake of making rough estimates I'll use the annualized rate: ~4x per year, or halving roughly every six months on average. (The data sources I used for this were particularly limited, so these estimates are very much ballpark figures - additional data could help a lot here.)
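To show where the annualized figure comes from (my own arithmetic, not from the report):

```python
import math

# Annualize a 223x cost reduction over 4 years.
total_reduction = 223
years = 4

annual_factor = total_reduction ** (1 / years)       # ~3.9x cheaper per year
halving_time_years = 1 / math.log2(annual_factor)    # ~0.5 years per halving

print(f"~{annual_factor:.1f}x per year, halving every ~{halving_time_years:.2f} years")
```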

Meanwhile, GPU cost per FLOPS has been halving every ~1.5 years, or about a factor of 6 over 4 years. (However, the hardware used in these experiments may not have followed the same trend as global hardware - this could be better answered if we had figures for both FLOPS-days and dollar costs.) We can infer that the vast majority of the training cost reduction over those 4 years was likely from software improvements (or, more generally, from taking better advantage of existing hardware, e.g. making wider use of the cheapest available hardware like TPUs). The reduction attributed to software is then roughly a factor of 223/6 ≈ 37x, or about 2.5x annually.
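Here's the rough decomposition I'm using, spelled out; it gives ~35x for software (slightly below the ~37x above, which rounds the hardware factor down to 6), so the ballpark conclusion is the same.

```python
# Split the 223x training-cost reduction (2017-2021) into a hardware part
# (cheaper FLOPS) and a residual software part. Rough estimates only.

total_reduction = 223                  # total cost reduction over 4 years
hw_halving_years = 1.5                 # $/FLOPS halves every ~1.5 years
years = 4

hw_factor = 2 ** (years / hw_halving_years)    # ~6.3x from cheaper hardware
sw_factor = total_reduction / hw_factor        # ~35x attributed to software
sw_annual = sw_factor ** (1 / years)           # ~2.4x per year

print(f"hardware ~{hw_factor:.0f}x, software ~{sw_factor:.0f}x (~{sw_annual:.1f}x/yr)")
```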

So even if the dollars spent on the largest AI experiments level off, we would expect to be able to achieve the same results at much lower cost. This suggests we should continue to see rapid progress on the state of the art even holding costs constant - though of course not as rapid as if we could also keep growing the dollars spent as fast as we are today.

To try to quantify this, we can think of the rate of price reduction to achieve a given performance benchmark (~4x annually) as an indicator of "AI performance" improving by 4x annually for constant dollars spent. As I mentioned earlier, "AI performance" of course means very different things in different contexts (this may not generalize to problems other than ImageNet), but it at least gives us a rough starting point.

We can compare this to the growth rate of the largest AI experiments in dollars of compute (~7x annually: the annual growth in experiment FLOPS-days divided by the annual improvement in FLOPS per dollar). So over the past few years, the total growth rate of "AI performance" on the largest experiments can be estimated as roughly 4 × 7 = 28x annually, including both the increased investment in AI and the increased effectiveness per dollar of that investment. If AI compute spending in dollars levels off (growing at a similar pace as world GDP), the total growth rate of maximum "AI performance" would then be just the 4x annual factor (which reflects both software improvements and hardware getting cheaper). So while today's doubling time of "AI performance" is ~2-3 months, the doubling time after spending levels off would become ~6 months.

Again, these are extremely rough numbers based on very little data, and these are back-of-the-envelope calculations that I haven't done very carefully, so take them with an ocean of salt - but they may at least give some ballpark idea of how to extrapolate AI trends into the future.
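Putting the whole back-of-the-envelope together in one place (same caveats as above - the 3.4-month and 1.5-year doubling times and the ~4x annual price reduction are the rough figures used throughout, and the conversion between them is my own):

```python
import math

# Growth of the largest training runs, measured in FLOPS-days.
flops_growth = 2 ** (12 / 3.4)                 # ~11.6x/yr (doubling every 3.4 months)

# Hardware gets cheaper, so dollar spend grows more slowly than FLOPS-days.
hw_cheapening = 2 ** (12 / 18)                 # ~1.6x/yr (halving every 1.5 years)
dollar_growth = flops_growth / hw_cheapening   # ~7.3x/yr growth in dollars spent

# "AI performance" per dollar improves ~4x per year (ImageNet cost trend).
perf_per_dollar = 4

# Today: more dollars spent AND better performance per dollar.
total_growth_now = perf_per_dollar * dollar_growth       # ~29x per year
doubling_now_months = 12 / math.log2(total_growth_now)   # ~2.5 months

# If spending levels off, only the per-dollar improvement remains.
doubling_later_months = 12 / math.log2(perf_per_dollar)  # 6 months

print(f"dollar growth ~{dollar_growth:.1f}x/yr, total ~{total_growth_now:.0f}x/yr")
print(f"doubling time: ~{doubling_now_months:.1f} months now, "
      f"~{doubling_later_months:.0f} months after spending levels off")
```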

Conclusion

  • The rate of growth of compute used to train the largest models is forecasted to start slowing dramatically within the next 4 years, as it starts to approach an appreciable fraction of total global computing capacity.
  • However, price-performance is also rapidly improving. The price to train to achieve a fixed accuracy on ImageNet decreased by a factor of ~200 over 4 years. (This reflects both software and hardware price-performance improvements.) So even if the dollars spent on the largest AI experiments grow more slowly, we would still expect to see rapid improvement in the effective size of experiments we are able to perform - although certainly not as rapid as if compute continued scaling as fast as it does today.
  • The price-performance data also weakly suggests that while hardware improvements are important, software improvements account for the vast majority of the price-performance gains.
  • This means that although hardware appears to contribute the bulk of AI progress currently, software will become the main contributor as the growth rate of AI compute slows over the next few years.

