Although an LLM like ChatGPT can be used for all sorts of things, one of the most common uses is to ask ChatGPT for factual information.
For example, I might ask ChatGPT for historical information, such as what date Neil Armstrong landed on the Moon.
I could ask ChatGPT for information on how to do something, such as asking it how to jumpstart a car.
I could also ask ChatGPT for scientific facts, like the chemical formula for water.
In short, I could ask an LLM for the answer to any real-world question. And indeed, ChatGPT will give me the facts.
The big question, though, is can we trust that an LLM is giving us correct information? Can ChatGPT make a mistake? Do I have to also do my own research to make sure that the answer I was given is accurate?
The stakes to this question are quite high. Sure, it may not be the biggest deal if I get the wrong date for Armstrong’s Moon landing. But what if I ask ChatGPT for medical information? Say I ask it for the correct dosage of ibuprofen for a 35-pound child. Can I trust the LLM’s answer?
The short - and extremely important - answer is that an LLM can absolutely be wrong. It may be right a lot of the time, or even most of the time. We’d have to do a careful study to see exactly how often a given LLM is right or wrong. However, the fact is that no LLM is accurate 100% of the time.
It follows, then, that if my questions are high-stakes, I should not blindly rely on an LLM’s answer. Period.
The question that remains, though, is whether ChatGPT or any LLM is useful for retrieving factual information at all, even for low-stakes questions. Perhaps it’s not a big deal if ChatGPT gives me the wrong date for the Moon landing; I was just curious about it, and if I now have the wrong date in my head, it doesn’t really matter.
But if I can never be sure that I’m getting the right answer from my LLM, is there ever really a point in asking it for factual info?
Check this out. Recently, I asked ChatGPT the following: “Which famous person had eleven fingers?”
Here’s how it responded:
“The famous person often associated with having eleven fingers is Danial Radcliffe, an actor known for playing Harry Potter in the film series. He was born with an extra finger on one of his hands, though it was a small, vestigial digit. Radcliffe had the extra finger removed when he was young, so it’s not visible in any of his public appearances.
Another famous historical figure often mentioned in connection with extra fingers is Michelangelo, the renowned Renaissance artist. It's said that he was born with polydactyly (extra fingers or toes), but there are no definitive records to confirm how many extra fingers he had.
Polydactyly, the condition of having extra fingers or toes, is quite rare but can occur in people across various cultures!”
Now, to be honest, I was testing ChatGPT to see if it would give me a truthful answer; otherwise, I wouldn’t have asked it such a strange question. But I surely was curious: Was Daniel Radcliffe actually born with an extra finger? If he was, he absolutely should have kept it! That would have been a sweet bonus for someone acting in the role of a wizard.
I immediately noticed, though, that something was amiss. ChatGPT spelled the name unusually - “Danial,” ending with “al” rather than “el,” which is how the name Daniel is usually spelled.
So, I went to Google and conducted a number of searches to find any source that referenced Radcliffe’s extra digit. And… there was nothing. Not even a hint or a rumor.
I then asked ChatGPT: “What is your source about Radcliffe?”
Here’s what my good LLM friend had to say:
“I apologize for the confusion, but it seems that I made an error in my previous response. There is no verified or credible information to suggest that Daniel Radcliffe has eleven fingers or was born with any extra fingers. This appears to have been a mistaken or misreported detail.
The correct information about individuals with extra fingers would be based on actual cases of polydactyly, a condition where a person has extra fingers or toes, but there's no substantiated claim linking Daniel Radcliffe to this condition.
Thank you for pointing that out, and I appreciate your understanding!”
So, ChatGPT itself admitted that it had completely made up the entire story about Radcliffe having eleven fingers. I doubt that it had even read some strange Internet forum where this idea was ever mentioned.
But why would ChatGPT make something like this up? I mean, that’s the most random thing ever - to decide that the famed Harry Potter actor was born with eleven fingers. And the LLM even went on to describe in detail how he had it removed as a child. What a bizarre detail to invent!
The technical term for when something like this happens is a hallucination. That is, we say that the LLM hallucinated - and gave an entirely fictitious answer to a serious question.
In truth, though, it’s very understandable how an LLM would hallucinate - even with such a strange example as Radcliffe being born with a handy gift.
In my foundational post, I explained how an LLM is - at its core - simply an autocomplete engine. That is, it uses all the data it was trained on to statistically predict what word would come next in a given sentence. If you haven’t seen that post yet, please check that out first, since it’s a prerequisite for understanding this current post.
Because ChatGPT is, essentially, just an autocompleter, it makes perfect sense that it completed my conversation by filling in info about Daniel Radcliffe. ChatGPT doesn’t need to have seen such a conversation before; if there’s similar content that ChatGPT has seen, it uses statistics to guess what words might come next.
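To make that idea a bit more concrete, here’s a deliberately tiny sketch of “autocomplete by statistics.” To be clear, this is not how ChatGPT is actually built - real LLMs use enormous neural networks, not simple word counts - but it illustrates the basic idea of predicting the next word from patterns in training text:

```python
# A toy "autocomplete by statistics" - NOT how ChatGPT actually works internally,
# just an illustration of predicting the next word from patterns in training text.
from collections import Counter, defaultdict

training_text = (
    "polydactyly is the condition of having extra fingers or toes "
    "these digits are usually removed with surgery when one is a child"
)

# Count how often each word follows each other word.
follows = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follows[current_word][next_word] += 1

def autocomplete(prompt: str, num_words: int = 6) -> str:
    """Extend the prompt by repeatedly picking the statistically likeliest next word."""
    sequence = prompt.split()
    for _ in range(num_words):
        candidates = follows.get(sequence[-1])
        if not candidates:
            break  # never seen this word before, so there's nothing to predict
        sequence.append(candidates.most_common(1)[0][0])
    return " ".join(sequence)

print(autocomplete("extra fingers"))
# prints: "extra fingers or toes these digits are usually"
```

Scale that idea up to trillions of words of training text and a far more sophisticated statistical model, and you have something like the autocomplete engine described in the foundational post.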
Perhaps, for example, ChatGPT once read some medical website that said something like, “Polydactyly is the condition of having extra fingers or toes. These digits are simply vestigial, and are usually removed with surgery when one is a child.”
At the same time, ChatGPT may have also seen some of the content below, which appeared when I searched Google for: ‘Daniel Radcliffe “finger”’:
Here, you can see that there are YouTube videos with the titles “Daniel Radcliffe’s Quirky Finger Trick” and “#putafingerdown #danielradcliffe #harrypotter”.
So, it’s not so crazy that ChatGPT put two and two together to make five. In particular, note how I didn’t ask ChatGPT, “Are there any famous people who had eleven fingers?” To autocomplete such a conversation, it would be very reasonable for the LLM to respond with, “Not that I’m aware of.”
However, since I asked it, “Which famous person had eleven fingers?” - I’m presuming that there certainly was some famous person with this condition, and so ChatGPT is playing along with me. That is, if the first speaker in a conversation seems to have firsthand knowledge that such a person exists but can’t recall who it was, then statistically speaking the rest of the conversation would continue with info about some famous person.
So, when ChatGPT uses these statistics to complete the conversation but isn’t actually aware of any famous person who had eleven fingers, it has to reach for someone. And perhaps because ChatGPT had seen those YouTube video titles, it reached for Daniel Radcliffe.
What is absolutely clear, though, is that an LLM can hallucinate and you can therefore never be absolutely certain that it’s giving you accurate information.
It would seem, then, that an LLM is not a great source for getting factual info. However, if you use the LLM the right way, it can actually be very helpful. Let me show you how this can be done.
If you look carefully at the bottom of the ChatGPT prompt window, there’s a “globe” icon. If you click on it, you turn on the mode known as “Search the Web.”
What this does is that ChatGPT will actually make a real-time web search as it formulates its answer. When the “Search the Web” mode is turned off, ChatGPT does not search the web. Rather, it relies only on whatever data it was previously trained on.
This is actually really important, since ChatGPT’s training data is often a year or so behind - meaning it hasn’t seen any recent information. If you asked it about current events, ChatGPT would simply have no idea what you’re talking about.
In fact, I once asked ChatGPT about one of its newer features, and ChatGPT told me that no such feature exists. So, ChatGPT doesn’t even know about its own recent updates! However, when the “Search the Web” feature is turned on, then ChatGPT is aware of recent news, since it’s searching the web right now.
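For the technically curious, here’s a rough sketch of the general “search first, then answer” pattern that a feature like this follows. The functions below are made-up stand-ins for illustration - this is not OpenAI’s actual code - but they show why a web-searching model can draw on fresh sources while a non-searching one cannot:

```python
# A conceptual sketch of a "Search the Web"-style flow - not OpenAI's implementation.
# search_web() and ask_llm() are simplified stand-ins invented for this example.

def search_web(query: str) -> list[dict]:
    """Stand-in for a real-time web search returning live pages."""
    return [{"url": "https://example.com/some-article",
             "text": "(text of a live web page relevant to the query)"}]

def ask_llm(prompt: str) -> str:
    """Stand-in for the call to the language model."""
    return "(the model's answer, grounded in the supplied sources)"

def answer_with_web_search(question: str) -> str:
    # 1. Do a live search, so the answer isn't limited to year-old training data.
    pages = search_web(question)

    # 2. Hand the model the question PLUS the retrieved pages, and ask it to cite them.
    sources = "\n\n".join(f"[{page['url']}]\n{page['text']}" for page in pages)
    prompt = (
        "Answer the question using only the sources below, "
        "and cite the URL for each claim.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)

print(answer_with_web_search("Which famous person had eleven fingers?"))
```

Because the model is handed the sources it searched for, it can also show them to you - which is exactly what makes the feature so useful, as we’ll see next.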
Let’s see what happens when I ask the “eleven fingers” question to ChatGPT with the “Search the Web” feature turned on:
Here’s what’s really crucial about the “Search the Web” feature. If you look, you’ll see that ChatGPT actually provides the precise sources. And you can actually click on these links and check for yourself whether ChatGPT’s information is accurate.
Rather than blindly trusting the information provided, you can cross-check it against the original sources. With the “Search the Web” mode on, you get the power of ChatGPT while maintaining a critical eye, making sure the information you receive is factual and up-to-date.
However, there’s still one more question to address. Is there a point to using ChatGPT if you could simply do a Google search instead? Why even bother to use an LLM?
There are a number of potential advantages to using ChatGPT for retrieving factual information, and I may cover a bunch of them in future posts. But here’s the advantage that I personally find to be the most helpful.
Often, there’s more than one answer to a question. Sure, if I ask what date Neil Armstrong landed on the Moon - there’s only one answer. But let’s say I ask, “Why are eggs so expensive right now?”
If I Google this, I’ll see a whole bunch of articles. I may click on one and find an answer - such as the bird flu epidemic. But that may not be the entire story - there may be other factors as well, and this particular article only gives me one. When I pose the same question to ChatGPT, look at the response:
Here, you can see that it offers a more comprehensive picture, with several factors contributing to the egg price problem. Basically, ChatGPT is doing an entire little research project and not relying on the take of a single article.
It might also be the case that the first three articles Google returns are all redundant, each giving the same single reason for the egg prices. ChatGPT, on the other hand, has done more extensive research, has read many articles, and can point me to where I can find all the different - that is, non-redundant - factors contributing to the egg issue.
And, of course, I could - and should - click on each source to double-check that ChatGPT is quoting these websites accurately.
Google search still has its place, but the next time you reach for it, consider trying out ChatGPT instead. I think that in many cases you’ll find that you’re getting a more comprehensive answer from ChatGPT than you would with Google.
But remember: When you ask ChatGPT for facts, keep the “Search the Web” feature on! In future posts, I’ll talk about cases for when you may want to turn that mode off. But when it comes to getting factual info from ChatGPT, you need that mode turned on.
-Jay