Is Decentralized AI Overhyped?

The Cherry Without the Cake

Rohan Tangri
3 min read · Nov 18, 2023
Header image: https://thesweetestoccasion.com/2010/06/pink-birthday-party-ideas-cherry-cake/

The AI Problem

It’s probably a good idea to start by defining the problem decentralized AI aims to solve. As a summary:

AI will make increasingly important decisions in society, with undesired bias

Basically, as AI models become more powerful, it’s not inconceivable that they will start being used to make more critical decisions in society (e.g. deciding which treatment a patient should have, or presenting legal evidence in a court of law). However, there is no current way to prove whether the model being run is giving unbiased results 🕵️. Essentially, a model host can either intentionally or accidentally provide a model that gives inaccurate answers with no way for the user to verify its quality.

One source of bias is the dataset used to train the model 💾. For example, suppose we are training a model to diagnose illness in patients. If the model is only trained on male medical data, it will likely give biased results when treating women.
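This kind of dataset bias can be shown in a toy sketch (hypothetical numbers, not real medical data): a diagnostic threshold fitted only on male patients misclassifies a healthy female patient whose baseline simply differs.

```python
# Toy illustration of dataset bias: a diagnostic threshold learned
# only from male patients. All numbers are made up for the example.

def fit_threshold(samples):
    """Pick the midpoint between the mean healthy and mean ill readings."""
    healthy = [x for x, ill in samples if not ill]
    sick = [x for x, ill in samples if ill]
    return (sum(healthy) / len(healthy) + sum(sick) / len(sick)) / 2

def diagnose(reading, threshold):
    return reading > threshold  # True = flagged as ill

# Training data: male patients only (resting heart rate, is_ill).
male_data = [(62, False), (66, False), (64, False), (88, True), (92, True)]
threshold = fit_threshold(male_data)  # midpoint of 64 and 90 = 77.0

# A healthy female patient with a naturally higher resting heart rate
# gets flagged as ill — the bias lives in the data, not the code.
print(diagnose(80, threshold))  # → True (a false positive)
```

Nothing in the model's code reveals the problem; you would have to inspect the training set to find it, which is exactly what a closed model host never lets you do.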

Another issue could be hidden model inputs not provided by the user 🤐. You might think you know what inputs a model takes, and therefore roughly which features were used in training, because you provide the input yourself. With something like ChatGPT, for example, you provide a text prompt, so you might assume its output is purely a function of that input. However, the model can take further inputs the user never sees, supplied by the model host at training time or at runtime. ChatGPT, for instance, is given an initial system prompt telling it what kind of LLM it is. You can see this in the recent OpenAI Dev Day talk, where building a custom GPT starts with an initial instruction that shapes all of its responses from then on.
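The point can be made with a minimal sketch (a toy stand-in, not a real LLM API): the host prepends a hidden system prompt, so the output is not a pure function of the user's input.

```python
# Toy stand-in for a hosted model: the host injects a hidden system
# prompt that the user never sees.

def toy_model(full_prompt: str) -> str:
    # Stand-in for inference: behavior depends on the *whole* prompt.
    if "answer in French" in full_prompt:
        return "Bonjour!"
    return "Hello!"

def hosted_chat(user_input: str, hidden_system_prompt: str) -> str:
    # From the user's point of view, only user_input exists.
    return toy_model(hidden_system_prompt + "\n" + user_input)

# Identical user input, different hidden instructions, different outputs:
print(hosted_chat("Greet me.", "You are a helpful assistant."))  # → Hello!
print(hosted_chat("Greet me.", "Always answer in French."))      # → Bonjour!
```

The user has no way to distinguish these two deployments from the outside, which is precisely the verification gap the post is describing.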

One potential solution for this is to make models open source (like Twitter did for their recommendation engine). Open source code is certainly more easily auditable by the public; for example, any hidden model inputs would be laid bare. However, this still has some problems:

  1. 💰 There is little economic incentive for companies to open source models
  2. 👥 There is no way to prove that the open-source model is the one being run
  3. ⛔️ Model parameters and the training dataset are still kept secret for privacy and economic competitiveness reasons

How does Decentralized AI Solve this?

Well, first of all, what is decentralized AI? It’s basically the idea of ‘putting AI on the blockchain’. In this new paradigm, model hosts would make their code open source, and fancy schmancy cryptographic proofs (like ZKPs ✨) would be used to verify compute and model attributes without disclosing private data. Importantly, the model itself runs off-chain; only the proofs are processed on-chain, minimizing the cost-intensive on-chain work. This has the potential to solve many of the issues left over from open sourcing alone.

Firstly, ZKPs can already give users the power to verify that the model they think is being run is the one actually being run. The longer-term gold standard, however, would be for model hosts to prove that their model is unbiased without disclosing their model parameters or training dataset 🤯🤯
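The verification idea can be sketched at a high level (assumptions: the host publishes a SHA-256 commitment to its open-source weights; the ZKP itself is out of scope and only noted in a comment):

```python
# Sketch of model-identity verification via a commitment (SHA-256 hash)
# to published weights. This shows *which* model the host claims to run;
# proving the inference actually used those weights is the job of a ZKP,
# which this sketch deliberately stubs out.

import hashlib

def commit(weights: bytes) -> str:
    return hashlib.sha256(weights).hexdigest()

# The audited, open-source release everyone can inspect.
published_weights = b"open-source model weights v1.0"
published_commitment = commit(published_weights)

# The host returns an answer plus the commitment of the weights it claims to use.
response = {"answer": "42", "model_commitment": commit(published_weights)}

# User-side check: the claimed model matches the audited release.
print(response["model_commitment"] == published_commitment)  # → True

# What a real system adds on top: a zero-knowledge proof, verified
# on-chain, that the answer was computed by the committed model —
# the hash comparison alone cannot show that.
```

Without the proof step, a dishonest host could publish the honest commitment but run a different model, which is exactly problem 2 from the open-source list above.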

I’m currently undecided on whether decentralized AI provides any better economic incentives than open source alone. One argument is that a decentralized model marketplace could be built, but the same marketplace could also exist on centralized infrastructure 🤷

Why am I Sceptical?

The main problem is that decentralized AI doesn’t actually solve the critical issue of identifying unbiased models ☹️. Introducing the blockchain is all about providing verifiable proofs, but neural network interpretability is still a fundamental research topic. OpenAI themselves don’t know how GPT-4 works from causal first principles, so they couldn’t prove their model is unbiased even if they wanted to.

I think in the future this will change, and so an argument could be made that we should put the on-chain infrastructure in place for when we do figure out how to interpret large neural networks. However, until that time, I have to conclude that decentralized AI is the cherry without the cake, and I wonder if we’re putting a disproportionate amount of effort into making sure the cherry is ripe 🍒
