What is AI?

Time to deep dive into what people really want to know.

Our First Deep Dive: AI

Will and I have spent the last few newsletters talking about a host of things, ranging from space factories to miracle weight-loss drugs to AI girlfriends, but one theme that has run through many issues is Artificial Intelligence (AI).

We’ve written about large language models (LLMs), generative AI, and all sorts of other concepts associated with AI without ever really explaining what AI is.

So this is our first deep-dive newsletter. Once a month, we will go deep on a topic and explain what it means as simply as possible.

This week we will try to shine a light on AI so that, by the end of this newsletter, when AI inevitably comes up at a friend’s dinner party, you can at least pretend to know what you’re talking about.

The AI umbrella

The AI umbrella - Copyright Jack Needham using Paint.

You can think of AI as an umbrella term that describes all sorts of things: the software, the outputs of that software, and the ways in which it “learns”. It includes applications like self-driving cars, facial recognition, and even silly Snapchat filters.

Ok, we’re no closer to answering the core question of what AI really is.

So what is AI?

I can’t answer that. Sorry to let you down so early on, but what became obvious when thinking about this is that “What is AI?” is just a bad question; it is too broad to mean anything.

Asking “What is AI?” is like asking “What is work?” We could give you a super-broad definition, but work for a miner and work for a superstar football player are so different that the definition doesn’t really help you in practice.

To reference one of our favourite books, The Hitchhiker’s Guide to the Galaxy: asking the right question is often more important than getting the right answer.

So maybe a better starting point is defining the types and verticals of Artificial Intelligence. As we begin to explain these verticals, you can begin to connect the dots on what this world means and its implications for the future. 

Connecting the dots under the AI umbrella. Copyright Jack Needham using Paint.

Generative AI

I think for most people reading this, AI is broadly synonymous with generative AI. This is a category of artificial intelligence that is responsible for all the AI images you have seen, tools like ChatGPT, and voice cloning models. 

Generative AI uses huge amounts of data (text, images, audio) to build models that create new content, whether that be text, images, or audio.

  1. Large Language Models (like ChatGPT)

Large Language Models, or LLMs, are the category of generative AI you are probably hearing the most about.

Wikipedia says a language model is a “probability distribution over a sequence of words.” I can translate that to mean they predict the next word in a sentence by understanding what came before it. If you open up your phone and type “I”, one of the keyboard suggestions will likely be “am” as the next word; this is a very rudimentary form of a language model.
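That keyboard-style predictor can be sketched in a few lines: count which word most often follows each word in some text, then suggest the most frequent one. The tiny corpus below is invented for illustration; real language models learn from vastly more text and far richer context than a single previous word.

```python
from collections import Counter, defaultdict

# Tiny invented "training" corpus of text.
corpus = "i am happy . i am tired . i was late .".split()

# Count which word follows which.
following = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    following[word][nxt] += 1

def predict_next(word):
    """Suggest the word most often seen after `word` in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("i"))  # prints "am" — it followed "i" twice, "was" only once
```

Scale this basic idea up by billions of parameters and terabytes of text and you are, very loosely, in LLM territory.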

What AI teams figured out was that the more data (text) you give the model, the better it becomes at completing requested tasks. Large language models therefore work off the same principle as above but consume a gigantic dataset (basically a massive file of text); GPT-3 has 175 billion parameters (think of parameters as little dials that training tunes to make the AI better) and was trained on around 570GB of text (think of this as the material those 175 billion dials are tuned against).

This is a cute explanation, but when you are at your friend’s dinner party, they are going to think you are a loser if you say “175 billion parameters.”

The questions that come up at dinner are usually around whether ChatGPT is going to take everyone’s jobs, and then take over the world.

So is ChatGPT going to take over the world? Probably not. Ok, but is it smarter than a human? Sort of. But machines are better than humans at a lot of things. 

It’s time for the complicated bit, so stick with us. AIs like ChatGPT get a lot of stuff wrong.

Rather than knowing X happened because of Y, they think “When X happens, Y is the most probable outcome.”

That means if you ask the question “What does 2+2 equal?”, instead of recognising the number 2 and doing the basic maths, its answer follows this reasoning: “Based on my training, what is the most frequently given answer to the question ‘What does 2+2 equal?’”

For this reason, they “hallucinate”, meaning they sometimes confidently give you wrong answers. The model might answer 5, for example.
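A deliberately cartoonish sketch of that difference: one function actually does the maths, the other just returns the most frequent answer it has “seen” before. The list of seen answers is invented; real models learn statistics over vastly more text, but the contrast in reasoning is the point.

```python
from collections import Counter

# Invented "training data": answers people gave to this question online.
seen_answers = ["4", "4", "4", "5", "four"]

def llm_style_answer(question):
    # No arithmetic: just return the most frequently seen answer.
    return Counter(seen_answers).most_common(1)[0][0]

def calculator_answer():
    # Actually does the maths.
    return str(2 + 2)

print(llm_style_answer("What does 2+2 equal?"))  # prints "4"
print(calculator_answer())                        # prints "4"
```

Here both happen to agree, but only because “4” was the most common answer in the data; skew the data and the LLM-style answer drifts while the calculator never does.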

A good analogy I read the other day is that LLMs are a very scalable version of an intern. You can ask them to write you a newsletter on AI, give you holiday ideas, or respond to customer feedback, and they’ll probably do a decent job, but chunks of it may well be wrong.

  2. AI image generators (DALL-E, Midjourney)

You can go into ChatGPT now and ask it to give you an image of something. Here is an image of me as an anime skateboarder. 

Honestly wish I was this cool

The best-known models for image generation are DALL-E, Midjourney, and Stable Diffusion.

But how does DALL-E do that? It’s trained in a broadly similar way to large language models: you feed a model an incredible amount of data. In this case, the data is images paired with other pieces of information, like labels. This helps the model understand that a dog is a dog. For a human, it’s obvious what a dog is, but computers are stupid, so you have to show them thousands of images of dogs and say “that bit there is a dog.”

Once your model is trained, you can use it. So when you ask DALL-E to give you an image of a dog, it needs to do two things:

  1. Understand what your prompt means

  2. Turn the prompt into an image

It uses the same techniques as ChatGPT to understand your prompts. Once it understands you want a dog, it uses a technique called diffusion to generate the image (diffusion is too long and complicated to explain here).
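The two steps above can be sketched as a toy pipeline. Everything here is a hypothetical stand-in, not a real API: `encode_prompt` fakes the “understand the prompt” step with placeholder maths, and `run_diffusion` only mimics the shape of diffusion (start from a blank canvas, refine it repeatedly toward the prompt) without any of the real machinery.

```python
# Hypothetical sketch of the prompt -> image pipeline. Not real model code.

def encode_prompt(prompt: str) -> list[float]:
    # Step 1: "understand" the prompt by turning the text into numbers.
    # (Real systems use a language model; word lengths are a placeholder.)
    return [float(len(word)) for word in prompt.split()]

def run_diffusion(embedding: list[float], steps: int = 50) -> list[float]:
    # Step 2: start from a blank "image" and refine it repeatedly,
    # nudging each refinement toward the prompt's numbers.
    image = [0.0] * len(embedding)  # stand-in for a grid of pixels
    for _ in range(steps):
        image = [0.9 * px + 0.1 * e for px, e in zip(image, embedding)]
    return image

image = run_diffusion(encode_prompt("a dog in a park"))
print(len(image))  # one "pixel" per prompt word in this toy sketch
```

The real process refines actual noise into pixels over many steps, guided by the prompt’s encoding; this sketch only preserves that overall shape.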

Again, this is a cute explanation that has a lot of real-world implications. Just this week, pornographic AI images of Taylor Swift were released on the internet. With over 30% of the world voting in elections this year, we’ll leave you to think about the political implications of AI images.

Beyond the event horizon 

How do we tie the bow here? Generative AI is a vertical of artificial intelligence that brings together multiple types of models and training techniques. It is just one category under the umbrella of AI that has risen to fame in the past 18 months. 

There are obvious limitations, but the rate of improvement is incredibly fast. We are not far from LLMs having the ability to reason and do more complex maths. Another way to think about it, as Sam Altman (CEO of OpenAI) recently mentioned on stage, is what percentage of human tasks ChatGPT can complete. Is it 10% right now? With the release of GPT-5, maybe it goes to 15%?

At some point, when it surpasses some threshold (maybe 51%), will we then have superhuman intelligence? Will it be what most people call artificial general intelligence (AGI)? We can’t see past that event horizon. What we do know is that we are moving incredibly quickly, and we have been in moments like this before: with the invention of the wheel, electricity, and the internet.

This will be the next big thing so keep an eye on it.