(Note: This post comes in both video and text form - enjoy it any way you’d like!)
To understand how AI impacts you, and how you can use AI to your benefit (or run away and hide from it), it's crucial to understand how AI actually works. Having the right mental model of how this stuff works is key to having any conversation of significance about AI. It will frame how you use AI and give you a good grasp of what it is and isn't capable of. So, let's dive in!
AI appears, on its surface, to be something super complex. Like, what does it even mean that a computer can possess intelligence? If it's only a hunk of metal, how can it think?
However, when we pull the covers off AI, we'll see that although it certainly has complexity to it, it really boils down to a simple idea. A bit like the boogeyman under the bed, we imagine it to be a lot more than it actually is. Once we turn on the lights, we can see that it's just... some sneakers.
Now, in this post I'm going to focus on one particular form of AI - the technology known as a Large Language Model, usually referred to by its acronym, "LLM." LLMs are the most widely known and talked-about form of AI at the time of this post, and this is because ChatGPT - the product that ushered in the new era of AI - is itself an LLM.
If you haven't seen ChatGPT in action yet, I highly recommend you sign up for an account. There's a free plan, and you don't have to put in any credit card info, so you don't have anything to lose.
In the future, I'll post a lot about how to best use ChatGPT and what it's good for, but for now, let me just show you a simple demo of working with it.
When we start ChatGPT up, it asks in a most friendly way, "What can I help with?"
And you can literally ask ChatGPT anything. You may not get the answer you were hoping for, but you can certainly ask it anything you can dream up. Let's ask it, "Help me explain to my mother why I shouldn't eat Brussels sprouts."
This text that I type into ChatGPT is known as a prompt.
Here's the result (in part):
There’s some clever stuff here!
Now, ChatGPT isn't simply acting as a search engine, looking up your question and spitting back the answer. There's a lot of text in this result that, if we search for it on Google, we won't find anywhere on the Internet. Rather, ChatGPT is generating its response from scratch.
How does ChatGPT come up with this stuff - and so quickly?
When we pull the covers off of LLMs, we'll find that they're just a fancy form of autocomplete - albeit a super fancy one.
You may be familiar with autocomplete features. When it comes to texting on a phone, for example, your phone may suggest the next word that you intend to type:
Here, I typed, "Hey how's it..." and the phone suggests some words that we're likely to type next. The words it suggests are pretty reasonable: "going", "looking", and "been".
Most of us are kind of used to this feature, so it doesn't strike us as particularly amazing. But if you think about it, how does this work, exactly? The phone isn't reading our mind, of course, so how does it suggest these autocomplete words?
The answer is that the phone has access to data of other text conversations (hopefully, not breaking any privacy laws or anything) - and can figure out statistically what word is likely to come next. Our phone may have seen numerous texts that contain the words "Hey, how's it..." within the text, and can see that, historically, the three most likely words to come next are "going", "looking", and "been."
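To make this idea concrete, here's a minimal sketch of how frequency-based next-word suggestion could work. The tiny corpus of messages here is made up for illustration - a real phone would draw on far more data:

```python
from collections import Counter

# A tiny, made-up corpus standing in for past text conversations.
corpus = [
    "hey how's it going with the new job",
    "hey how's it been since we last talked",
    "hey how's it going today",
    "hey how's it looking for tonight",
    "hey how's it going",
]

def suggest_next(prefix, corpus, top_n=3):
    """Tally which word follows `prefix` in the corpus; most common first."""
    followers = Counter()
    prefix_words = prefix.lower().split()
    for message in corpus:
        words = message.lower().split()
        for i in range(len(words) - len(prefix_words)):
            if words[i:i + len(prefix_words)] == prefix_words:
                followers[words[i + len(prefix_words)]] += 1
    return [word for word, _ in followers.most_common(top_n)]

print(suggest_next("hey how's it", corpus))  # → ['going', 'been', 'looking']
```

The suggestions are nothing more than a popularity contest: "going" wins because it showed up most often after "hey how's it" in the past data.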
Now, perhaps the reason we're not so impressed with our phone's autocomplete feature is because it's not so great; that is, very often, it misses the mark on what we wanted to type next. If you want to see just how off the mark your phone can be, try typing a word or a phrase into your phone, and then keep hitting one of the autocomplete buttons. Don't do any more typing; just keep mashing the autocomplete button and see what text comes out.
Here's one text - where I started by typing “Are you serious?” I then kept mashing the middle autocomplete suggestion and here’s what came out:
And yep, that’s totally what I meant to say.
Here's another example of a phone “fail.” When I type, "Mary had a little..." - my phone offers me the following autocomplete suggestions: "bit", "more", and "too." I would’ve thought that one of the suggestions would be “lamb”! To be fair, though, I'm not sure how often people text nursery rhymes to each other.
However, let's see what happens when we put the same prompt into ChatGPT. Let's type, "Please complete the following sentence: `Mary had a little...`"
This time, I get the result I was hoping for: "...lamb, its fleece as white as snow.” (Plus, some other weird stuff.)
Now, here's the reason why ChatGPT works so much better than my iPhone. I'm not sure what data my phone is using to predict the next word, but ChatGPT is using... like, the entire Internet. In other words, ChatGPT has read - okay, perhaps not the entire - but a huge portion of the Internet. On top of that, it has also read libraries of books and a ton of other info.
Based on everything that ChatGPT has read, it now uses all that data to predict my next word. More specifically, here's a simplified example of what ChatGPT does.
Say that ChatGPT has seen nine different websites that contain the "Mary had a little lamb" nursery rhyme. But, perhaps, it also read some chat history which said, "Mary had a little too much to eat."
So, now, reasons ChatGPT, when a user asks it to guess the next word after "Mary had a little..." - based on the statistics of past data, there's a 90% chance that the next word is "lamb," and a 10% chance that the next word is "too." And so, ChatGPT goes with the odds and supposes that the next word should be "lamb."
Note that ChatGPT isn't looking only at the word "little" and guessing what word comes next. After all, if in some other context I wrote the word "little," it's unlikely that the next word would be "lamb." Rather, ChatGPT is taking the entire context into consideration. That is, when we take the entire phrase "Mary had a little..." into account, it's likely that the next word will be "lamb."
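The nine-websites example above can be sketched in a few lines of code. This is a toy illustration of the statistics, not how an LLM is actually built - real LLMs use neural networks, not literal lookup tables over their training data:

```python
from collections import Counter

# Toy training data: 9 pages with the nursery rhyme, 1 chat message.
training_texts = (
    ["mary had a little lamb"] * 9 +
    ["mary had a little too much to eat"]
)

def next_word_odds(context, texts):
    """For every text starting with `context`, tally the word that follows,
    then convert the tallies into probabilities."""
    counts = Counter()
    context_words = context.lower().split()
    for text in texts:
        words = text.lower().split()
        if words[:len(context_words)] == context_words and len(words) > len(context_words):
            counts[words[len(context_words)]] += 1
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

print(next_word_odds("mary had a little", training_texts))
# → {'lamb': 0.9, 'too': 0.1}
```

Notice that the whole phrase "mary had a little" is what drives the prediction - feed in a different context and "lamb" may not come up at all.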
And that's it. This is how ChatGPT and all other LLMs work. Sure, there's more nitty-gritty detail to it, but at its core, an LLM is just an autocomplete machine that uses other data - such as previously written books and articles - to statistically predict the next word. What makes ChatGPT so powerful, though, is the fact that it has seen so much data. After you've read, say, the entire Internet, you can have a really good sense of what the statistics are.
Now, the example I just used most recently was one where I asked ChatGPT to perform an autocomplete. That is, I explicitly asked ChatGPT to predict the next word (of “Mary had a little”). But in truth, when we simply chat with ChatGPT, it's doing the exact same thing: it's using autocomplete to complete its sentences and paragraphs.
Let's jump back to our first chat example, where we asked ChatGPT to help me craft an explanation for not eating Brussels sprouts. What's going on here under the hood is also autocomplete. However, instead of me explicitly asking ChatGPT to complete a sentence, ChatGPT is autocompleting what would come next after my prompt to it.
Here’s what I mean.
Imagine that somewhere on the Internet there was a chat where one person posted, "Help me explain to my mother why I shouldn't eat Brussels sprouts." Someone else may have then replied by saying, "Explaining your dislike for Brussels sprouts..." and so on. And so, ChatGPT is simply autocompleting my conversation with it - patterned off of that same conversation on the Internet.
Now, it's not necessarily the case that there ever was - in the history of the Internet - an actual conversation such as this. However, there is enough similar content on the Web out there so that ChatGPT is able to extrapolate and still guess statistically what words should come next in our chat conversation. Like I said, there are more details to all of this, but the guts of what's going on inside ChatGPT is that it's - at its core - an autocomplete engine. And again, the reason why this engine is so powerful is because it has read so much data.
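This "autocomplete the whole conversation" idea can be sketched too: predict one word, append it, and predict again - exactly like mashing your phone's autocomplete button over and over. The miniature "chat history" texts below are invented for the demo:

```python
from collections import Counter

# Made-up stand-ins for conversations seen during training.
texts = [
    "help me explain why i dislike sprouts explaining your dislike politely works best",
    "help me explain why i dislike sprouts explaining your dislike calmly works too",
]

def most_likely_next(context_words, texts):
    """Return the single most common word following `context_words`."""
    counts = Counter()
    n = len(context_words)
    for text in texts:
        words = text.split()
        for i in range(len(words) - n):
            if words[i:i + n] == context_words:
                counts[words[i + n]] += 1
    return counts.most_common(1)[0][0] if counts else None

reply = "help me explain why i dislike sprouts".split()
# Predict one word at a time, feeding each prediction back in as context.
for _ in range(5):
    nxt = most_likely_next(reply, texts)
    if nxt is None:
        break
    reply.append(nxt)
print(" ".join(reply))
```

The loop never "plans" its answer; each word is just the statistically most likely continuation of everything written so far - prompt and partial reply alike.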
Another useful piece of jargon to know - is that when we say that an LLM has read data, we say that the LLM was trained on that data. To put this into a sentence: "ChatGPT was trained on so much data, including a large portion of the Internet plus many other books and articles."
Now, knowing that LLMs work as statistically based autocomplete engines is super important, because it informs what we should expect from an LLM, as well as how best to use one.
For example, here's one major ramification of how LLMs work. ChatGPT has been trained on a big bulk of the web - and while there's some good writing on the web, there's also some pretty terrible writing. The same goes for books.
So, say that I ask ChatGPT to write a brand new novel. Perhaps I'll prompt it with: "Please write a young-adult fantasy novel, weaving magic, mystery, and a journey of self-discovery."
What ChatGPT is now going to do is create a novel that is the average of all novels in this genre. It's not going to write a creative story with breakthrough ideas. It'll write a book along the lines of the average mush of all novels - good and bad - mashed together. Or, it might write a book that is suspiciously very, very similar to Harry Potter.
Now, I'm not saying that you shouldn't use ChatGPT to help you write a novel. Armed with the right set of tricks and tips, ChatGPT can be immensely helpful in such an endeavor. However, based on your newfound knowledge about how LLMs work, you now know that ChatGPT won't output a great novel when you prompt it simply in this way.
My future posts will cover how to best use ChatGPT and other LLMs, and this post serves as the foundation for all that exciting info that is yet to come.
Stay tuned!
-Jay
For further reading: What is ChatGPT doing and why does it work?
(P.S. All the images and background music featured in the video post were generated with ChatGPT and other forms of AI! This saved me a ton of time.)