The AI Compute Problem
A Case for Analogue Computers
It feels like everyone has been talking about AI since OpenAI released ChatGPT; you know it’s big when mainstream news outlets start covering it! AI as an academic discipline has been around for a long time, though, and this isn’t its first ‘bull run’ (anyone remember the massive hype when CNNs first arrived to revolutionize computer vision?) 📈. Interestingly, if you pick up a textbook on the fundamental concepts of AI from 20 years ago and compare it to one today, most of the material will be pretty much unchanged. The advances in AI in recent years have actually been mainly engineering-led rather than maths-led 🧑‍💻. In particular, most of the gains we see come from scaling neural networks to trillions of parameters.
Why do modern deep networks require a lot of compute?
As the number of trainable parameters increases, the computation required both to train a model and to run inference grows massively. One reason is that we start running into the curse of dimensionality. When training a model, we’re basically minimizing a loss function with respect to the trainable model parameters. Imagine trying to minimize a function: 1) with respect to a single parameter, versus 2) with respect to two parameters. The first problem lives in 2D (one parameter axis plus the loss value), whilst the second lives in 3D. This change…
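To make that concrete, here’s a minimal sketch (illustrative only, not from any particular framework) of gradient descent on a toy quadratic loss. The loss, the learning rate, and the parameter counts are all made up for illustration; the point is simply that every update has to touch every parameter, so the work per step grows with the size of the parameter vector.

```python
# Illustrative sketch: gradient descent on a toy quadratic loss.
# The loss and hyperparameters here are assumptions for demonstration,
# not the setup of any real network.
import numpy as np

def loss(theta):
    # Toy loss: sum of squared parameters; its minimum is at theta = 0.
    return np.sum(theta ** 2)

def grad(theta):
    # Gradient of the toy loss: one partial derivative per parameter,
    # so computing it costs work proportional to len(theta).
    return 2 * theta

def gradient_descent(n_params, lr=0.1, steps=100):
    theta = np.random.randn(n_params)   # random starting point
    for _ in range(steps):
        theta -= lr * grad(theta)       # each update touches every parameter
    return loss(theta)

# Minimizing over 1 parameter vs 2 parameters vs a million parameters:
for n in (1, 2, 1_000_000):
    print(n, "parameters -> final loss:", gradient_descent(n))
```

Even in this toy case, going from one parameter to a million multiplies the arithmetic per update by roughly a million; in a real deep network the gradient also has to be backpropagated through every layer, which is where the enormous compute bills come from.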