A search engine researcher talks about the pros and cons of letting ChatGPT and similar programmes do your web searching for you.
Before search engines became the norm, people found and accessed information mainly through librarians and subject or search experts. This model was interactive, personalised, transparent, and authoritative. Today most people get their information from search engines, but typing in a few keywords and getting back a list of results ranked by some unknown function is not ideal.
A new generation of information access systems based on artificial intelligence, like Microsoft’s Bing/ChatGPT, Google/Bard, and Meta/LLaMA, is turning the traditional way of searching and getting results on a search engine on its head. These systems can take full sentences or even paragraphs as input and come up with natural language responses that are tailored to the person.
At first glance, this might seem like the best of both worlds: personalised answers and the huge amount of information available on the internet. But as a researcher who studies search and recommendation systems, I think the picture is at best mixed.
Large language models are used to build AI systems like ChatGPT and Bard. A language model is a machine-learning method that looks for patterns in a large number of texts, like Wikipedia and PubMed articles. If you give these models a group of words or a phrase, they figure out what word is likely to come next.
By doing this, they can come up with sentences, paragraphs, and even pages that answer a user’s question. On March 14, 2023, OpenAI announced the next generation of the technology, called GPT-4, which works with both text and images. Microsoft also announced that its conversational Bing is based on GPT-4.
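The next-word prediction described above can be illustrated with a deliberately tiny sketch. This is not how ChatGPT or Bard work internally (they use large neural networks trained on vast amounts of text); it is a toy model that only counts which word follows which. The training text and the function names `train_bigram_model`, `predict_next`, and `generate` are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(model, start, max_words=5):
    """Extend `start` one predicted word at a time."""
    output = [start.lower()]
    for _ in range(max_words):
        next_word = predict_next(model, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)

# Made-up training text, for illustration only.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram_model(corpus)

print(predict_next(model, "the"))   # cat
print(generate(model, "sat", 3))    # sat on the cat
```

Note that "sat on the cat" never appears in the training text. The toy model stitched it together from word patterns alone, so the result reads fluently without reflecting anything the text actually said.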
This type of information retrieval works quite effectively thanks to training on large bodies of text, fine-tuning, and other machine-learning methods. Systems based on large language models generate personalised responses to information queries.
People were so impressed with the results that ChatGPT got to 100 million users in a third of the time it took TikTok to reach the same number. People have not only used it to find answers, but also to make diagnoses, come up with diet plans, and find good investments.
ChatGPT’s Opacity and AI ‘Hallucinations’
But there are plenty of downsides. First, think about what’s at the core of a large language model: a mechanism for connecting words and, presumably, their meanings. This produces output that often seems like an intelligent response, but large language model systems are known to produce almost parroted statements without any real understanding. So, the output from these systems may seem smart, but it’s just a reflection of the word patterns the AI has found and put in the right context.
This limitation makes large language model systems prone to making up, or “hallucinating”, answers. The systems are also not smart enough to recognise when a question rests on a false premise, so they answer it anyway. For example, when asked which U.S. president’s face is on the $100 bill, ChatGPT says Benjamin Franklin, even though Franklin was never president and the premise that the $100 bill has a picture of a U.S. president is wrong.
The problem is that you don’t know which 10% of the time these systems are wrong, even if they are wrong only 10% of the time. People also can’t quickly check if the systems’ answers are correct. This is because these systems aren’t transparent. They don’t show what data they were trained on, what sources they used to find answers, or how they came up with those answers.
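To see why an unknown 10% matters, here is a rough back-of-the-envelope sketch. It takes the hypothetical 10% error rate from the text and adds one simplifying assumption that is not in the original: each answer is wrong independently of the others. The function name `chance_of_any_error` is made up for illustration.

```python
def chance_of_any_error(error_rate, num_answers):
    """Probability that at least one of `num_answers` answers is wrong,
    assuming (for illustration) that errors are independent."""
    return 1 - (1 - error_rate) ** num_answers

# Hypothetical 10% error rate, as in the text above.
for n in (1, 5, 10, 20):
    print(n, round(chance_of_any_error(0.10, n), 2))
```

Under these assumptions, someone who relies on 10 answers has roughly a 65% chance that at least one of them is wrong, with no indication of which one.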
You could, for example, ask ChatGPT to write a technical report with references. But most of the time, it makes up these citations by “hallucinating” the titles and authors of scholarly papers.
The systems also don’t check whether their answers are correct. This leaves validation to the user, who may lack the motivation or skills to do it, or may not even realise that an AI’s answers need checking. Because it doesn’t know any facts, ChatGPT doesn’t know when a question doesn’t make sense.
AI Stealing Content – and Traffic
Lack of transparency can be bad for users, but it’s also unfair to the authors, artists, and creators of the original content from which the systems have learned, because the systems don’t reveal their sources or give sufficient attribution. Most of the time, creators are not paid, not credited, and not given the chance to consent.
There is also a business side to this. In a typical search engine, the results are shown with links to where the information came from. This not only lets the user check the answers and see where they came from, but it also brings more people to those sites.
Many of these sources get their money from this traffic. Large language model systems give direct answers but don’t say where they got those answers, so I think it’s likely that those sites’ income will go down.
Large Language Models Can Take Away Learning and Serendipity
Last but not least, this new way to get information can also make people feel less in control and take away their chance to learn. A typical search process lets users see what kinds of information they could find, which often makes them change what they’re looking for.
It also gives them a chance to find out what’s out there and how different pieces of information fit together to help them accomplish their tasks. And it allows for chance discoveries, or serendipity.
These are very important parts of a search, but when a system gives results without showing where the information came from or walking the user through a process, it takes away these options.
Large language models are a big step forward for accessing information. They give people a way to interact with computers using natural language, get personalised answers, and find answers and patterns that are hard for the average user to find. But they have serious limitations because of how they learn and construct responses. Their answers could be wrong, harmful, or biased.
Other information access systems can suffer from these problems too, but large language model AI systems also lack transparency. Worse, their responses in natural language can give uninformed users a false sense of trust and authority, which can be dangerous.