Are AI Hallucinations Still a Problem in 2024?

AI models have major trust issues. The tendency of chatbots like ChatGPT to produce verifiably false outputs does not make it easy for users to trust the content they create.

This tendency has created quite a few PR nightmares — just recently, Google’s search AI recommended adding glue to pizza to keep the cheese from sliding off.

And a false claim by Gemini forerunner Google Bard that the James Webb Space Telescope took the first picture of a planet outside Earth’s solar system wiped roughly $100 billion off Alphabet’s market value.

These kinds of incidents show that you can’t rely on large language models (LLMs) to generate accurate outputs – and have seriously hampered the usefulness of these tools as research assistants.

After all, if you need to double-check every output, you may as well just use a traditional search engine in the first place.

But just how common are hallucinations in 2024? Despite significant advances in LLM technology, today’s models still generate incorrect and misleading information. Let’s look at why this matters.

Key Takeaways

  • Hallucinations are one of the biggest problems with LLM-driven chatbots like ChatGPT.
  • When a model hallucinates, it makes up information — and the only way to spot it is to verify and fact check answers yourself.
  • 89% of ML engineers report that their LLMs exhibit signs of hallucinations.
  • Users need to be aware of hallucinations, or they could be misled by misinformation.

Why Do AI Hallucinations Matter?

Generative AI-driven chatbots can be powerful tools, but they often spread misinformation. In fact, a study released earlier this year by Aporia suggests that 89% of machine learning engineers report their LLMs exhibit signs of hallucinations.

Many AI vendors are also open about the risks of hallucinations in their products. For instance, OpenAI notes that while ChatGPT can produce human-like responses, it can also produce outputs that are “inaccurate, untruthful, and otherwise misleading.”

The prevalence of these hallucinations means that users can’t afford to treat chatbots as reliable sources of information.

Joseph Regensburger, VP of Research at Immuta, told Techopedia:

“Generative AI works as a probability chain, and it does a very good job at delivering strong output when it’s tied to tangible and accurate data.

“But, if it’s not tied to tangible and accurate data, it will ‘hallucinate’ or produce fictional output that actually looks very believable.

“That’s why at least for the foreseeable future, AI will and should be more of a human aid rather than a hands-off replacement. Until the data AI uses has better processes and policies that ensure its quality and accuracy, it cannot be hands-off.”

What Causes AI Hallucinations?

The reasons for hallucinations are complex, but a simple answer is that they come down to the quality of the model’s training data. If that data is incomplete or unrepresentative, the language model’s contextual understanding suffers and biases creep in.

Sidrah Hassan, an AI ethicist at AND Digital, told Techopedia:

“AI hallucinations can arise from various factors, and their prevalence is likely to increase as the GenAI landscape expands. Key contributors to these hallucinations include data bias, lack of data context, and overfitting [placing high importance on relatively inconsequential data].

“Data bias can lead to the generation of incorrect or misleading results, as AI algorithms may inadvertently pick up on inaccuracies within the training data.”

Hassan notes that lack of data context can result in unreliable outputs, limiting a language model’s contextual understanding — an issue which can be exacerbated by overfitting, as an AI model incorporates irrelevant patterns.

“The impact of these hallucinations on the reliability of search results is substantial, particularly considering that search engines rank as the third most-visited sites on the internet.

“The infamous example is Google’s Gemini recommending putting glue on a pizza to make the cheese stick. Such instances suggest a need for more diligent oversight and proactive measures in the development of AI search engines.”

How Users and AI Vendors Can Address Hallucinations

At a high level, addressing hallucinations is simple: users need to proactively fact-check. In practice, this means double-checking every claim made by an LLM against a reliable third-party source.

While this is inconvenient, some tools like Perplexity AI seek to make this process simpler by providing users with citations to external sites.

Users can also attempt to reduce the risk of hallucinations by being more specific with their prompts and instructions. Aim to provide as much context as possible in your prompts to reduce the chance of the language model filling in the blanks.
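As a rough illustration (the prompts and incident details below are invented for the example), compare a vague request with one that supplies its own source material and constraints:

```python
# Illustrative only: the same request with and without supporting context.
# Giving the model the source material narrows what it has to "fill in".

vague_prompt = "Summarise our Q3 security incident."

specific_prompt = (
    "Summarise the security incident described in the report below in three "
    "bullet points, using only facts stated in the report. If something is "
    "not covered by the report, say so rather than guessing.\n\n"
    "Report: On 12 July, a phishing email led to one compromised account; "
    "the account was disabled within two hours and no customer data was accessed."
)
```

The second prompt gives the model both the facts to work from and permission to admit uncertainty — two of the simplest levers users have against hallucinated detail.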

On the vendor side, there are no hard-and-fast answers, but focusing on improving the quality of training data is a great place to start. This could be as basic as training models on highly curated datasets that contain minimal factual inaccuracies.

Another way to reduce the likelihood of hallucinations is to use techniques like retrieval-augmented generation (RAG). With RAG, a model retrieves relevant information from an external knowledge base and uses it to ground its response to a prompt, which can make its answers more reliable and better aligned with the user’s needs.
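As a minimal sketch of the idea (the knowledge base, keyword-overlap retriever, and prompt wording here are simplified stand-ins, not any particular vendor’s implementation), a RAG pipeline retrieves relevant passages and prepends them to the prompt before it reaches the model:

```python
# A minimal retrieval-augmented generation (RAG) sketch. The knowledge base,
# the keyword-overlap scoring, and the prompt template are illustrative
# placeholders, not a specific product's implementation.

def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Rank passages by naive keyword overlap with the query (a stand-in
    for the embedding-based vector search a real system would use)."""
    query_terms = set(query.lower().split())
    scored = [
        (len(query_terms & set(passage.lower().split())), passage)
        for passage in knowledge_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for score, passage in scored[:top_k] if score > 0]


def build_grounded_prompt(query: str, passages: list[str]) -> str:
    """Prepend the retrieved passages so the model answers from supplied
    context instead of relying only on what it memorised during training."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. If the context does not "
        "contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    kb = [
        "The James Webb Space Telescope launched on 25 December 2021.",
        "Retrieval-augmented generation grounds model outputs in external documents.",
    ]
    question = "When did the James Webb Space Telescope launch?"
    prompt = build_grounded_prompt(question, retrieve(question, kb))
    print(prompt)  # This grounded prompt is what gets sent to the LLM.
```

In production systems the keyword overlap would typically be replaced by an embedding-based vector search, but the structure is the same: retrieve, build a grounded prompt, then generate.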

The Bottom Line 

Hallucinations are an issue that still needs to be addressed in 2024 if AI adoption is to increase, particularly in the broader search market.

For now, users should be prepared to verify all claims made by LLMs to make sure they’re getting the correct information.

Tim Keary
Technology Specialist

Tim Keary is a freelance technology writer and reporter covering AI, cybersecurity, and enterprise technology. Before joining Techopedia full-time in 2023, his work appeared on VentureBeat, Forbes Advisor, and other notable technology platforms, where he covered the latest trends and innovations in technology.