DeepMind’s New Approach to Avoiding Hallucinations in Large Language Models

By: Erika Barker

In case you’re in a hurry

  • Large Language Models (LLMs) are impressive at generating text but sometimes produce nonsense, known as “hallucinations.”
  • Researchers at DeepMind have developed a new method to reduce these errors by enabling LLMs to self-evaluate their responses.
  • The technique, called conformal abstention, allows the LLM to say “I don’t know” when unsure about an answer.
  • This method has been tested and shown to effectively reduce hallucinations on datasets with both short and long responses.

Tackling Hallucinations in Large Language Models: DeepMind’s New Approach

So, I admit I am shamelessly writing this article after reading my daily go-to, TechXplorer.com (which you should add to your daily reading list, by the way). Anyway, let’s talk about the buzz of the town: Large Language Models (LLMs). In case you don’t know, these are the nifty AI systems behind OpenAI’s ChatGPT and Google’s Gemini (formerly Bard), and they can churn out text like nobody’s business, from answering your deepest questions to generating the next chapter of your favorite book (what really did happen to Jon Snow if we pretend the last season of Game of Thrones never existed?). But as cool as they are, they’ve got a bit of a hiccup: they sometimes hallucinate. No, not the psychedelic kind, but the kind where they produce answers that are total nonsense or plain wrong. NONSENSE, I TELL YOU!

The Hallucination Problem

Imagine asking your digital assistant a straightforward question, and it starts spouting gibberish, or a politician at a town hall… actually better not touch that one. That’s what happens when LLMs hallucinate. These models, though brilliant, sometimes get their wires crossed and give us answers that don’t make sense. It’s like asking your friend for directions, and instead of telling you how to get to the grocery store, they start describing how the moon is flat and how they should be on the Joe Rogan podcast.

Enter DeepMind’s Solution

The brainiacs over at DeepMind have been working on a solution to this pesky problem. They developed a method to help LLMs recognize when they might be about to hallucinate and choose to keep quiet instead. Think of it as the AI equivalent of knowing when to say, “I don’t know,” instead of making things up.

How It Works

DeepMind’s new method uses something called conformal abstention. Here’s a breakdown:

  • Self-Evaluation: The LLM checks its own responses for consistency. If it gets multiple different answers for the same question, that’s a red flag.
  • Similarity Scoring: The model evaluates how similar its various responses are to each other. If they’re wildly different, the model leans toward not answering (there’s a toy sketch of this scoring step right after this list).
  • Conformal Prediction: A calibration step based on conformal prediction turns that consistency check into an answer-or-abstain rule, with a statistical guarantee on how often the model answers and still gets it wrong.
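
To make the first two bullets concrete, here’s a minimal, purely illustrative sketch in Python. The names jaccard_similarity and self_consistency_score are my own placeholders, and plain word overlap is standing in for whatever semantic similarity measure a real system would use; the paper doesn’t prescribe this exact code.

    import itertools
    import re

    def jaccard_similarity(a, b):
        # Crude word-overlap similarity between two responses; a stand-in for
        # a proper semantic similarity measure.
        ta = set(re.findall(r"\w+", a.lower()))
        tb = set(re.findall(r"\w+", b.lower()))
        return len(ta & tb) / len(ta | tb) if ta and tb else 0.0

    def self_consistency_score(responses):
        # Average pairwise similarity across several sampled answers to the
        # same question. A low score means the model keeps contradicting
        # itself, which is the red flag described above.
        pairs = list(itertools.combinations(responses, 2))
        if not pairs:
            return 1.0
        return sum(jaccard_similarity(a, b) for a, b in pairs) / len(pairs)

    # Pretend these came from asking the model the same question several
    # times with sampling turned on.
    agreeing = [
        "The Eiffel Tower is in Paris.",
        "The Eiffel Tower is located in Paris, France.",
        "The Eiffel Tower is in Paris, France.",
    ]
    contradictory = [
        "The Eiffel Tower is in Paris.",
        "Actually, it stands in Rome, Italy.",
        "I think it was never completed.",
    ]
    print(round(self_consistency_score(agreeing), 2))       # high: answers agree
    print(round(self_consistency_score(contradictory), 2))  # low: red flag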

In practice, this means that the LLM will only answer when it’s pretty confident it’s got the right answer. Otherwise, it’ll politely refrain from responding, which is a lot better than giving you a made-up answer.
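And here’s an equally rough sketch of the “know when to stay quiet” part. Fair warning: this is a simplified, conformal-flavored calibration of my own, not DeepMind’s exact recipe, and calibrate_threshold and answer_or_abstain are hypothetical helper names. The point is just that the abstention cutoff isn’t guessed; it’s set using held-out questions so that confident-but-wrong answers stay rare.

    def calibrate_threshold(scores_on_wrong_answers, alpha=0.1):
        # Simplified, conformal-style calibration (not the paper's exact method):
        # given self-consistency scores measured on held-out questions where the
        # model's answer turned out to be wrong, pick a cutoff so that only about
        # an alpha fraction of such wrong answers would have scored above it.
        ranked = sorted(scores_on_wrong_answers)
        n = len(ranked)
        k = min(n - 1, int((1 - alpha) * (n + 1)))  # conformal-style rank
        return ranked[k]

    def answer_or_abstain(candidate_answer, consistency_score, threshold):
        # Answer only when the self-consistency score clears the calibrated bar;
        # otherwise admit uncertainty instead of making something up.
        if consistency_score >= threshold:
            return candidate_answer
        return "I don't know."

    # Toy usage with made-up calibration scores (the two query scores echo the
    # agreeing and contradictory examples from the previous sketch).
    threshold = calibrate_threshold(
        [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40], alpha=0.25
    )
    print(answer_or_abstain("Paris", consistency_score=0.83, threshold=threshold))  # answers
    print(answer_or_abstain("Rome?", consistency_score=0.06, threshold=threshold))  # abstains

The conformal prediction piece is what makes that cutoff more than a gut feeling: because it’s chosen from calibration data, you get a guarantee (at the level of the whole question distribution) that confidently delivered wrong answers stay around the small rate you picked in advance, assuming new questions look like the calibration ones.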

Putting It to the Test

The researchers tested their method on benchmark datasets like Temporal Sequences and TriviaQA, using Google’s Gemini Pro model. The results were promising:

  • On datasets with short answers, like TriviaQA, the method worked as well as existing uncertainty measures.
  • On datasets with long, complex responses, the new method was much better at knowing when to abstain.

Why This Matters

This development is a big deal for anyone using LLMs, from tech companies to educators. By reducing the rate of hallucinations, these models become more reliable and trustworthy. Imagine a future where your AI assistant can confidently handle even more complex tasks without slipping into nonsensical territory.

So, DeepMind’s new approach to handling LLM hallucinations is a real step toward making these models more dependable. It’s like teaching them a bit of humility: knowing when it’s okay to say, “I don’t know,” can be just as important as having all the answers.
