AI chatbots ‘highly vulnerable’ to repeating false medical information, experts warn

Your support helps us to tell the story

From reproductive rights to climate change to Big Tech, The Independent is on the ground when the story is developing. Whether it’s investigating the financials of Elon Musk’s pro-Trump PAC or producing our latest documentary, ‘The A Word’, which shines a light on the American women fighting for reproductive rights, we know how important it is to parse out the facts from the messaging.

At such a critical moment in US history, we need reporters on the ground. Your donation allows us to keep sending journalists to speak to both sides of the story.

The Independent is trusted by Americans across the entire political spectrum. And unlike many other quality news outlets, we choose not to lock Americans out of our reporting and analysis with paywalls. We believe quality journalism should be available to everyone, paid for by those who can afford it.

Your support makes all the difference.

AI chatbots are frequently prone to repeating false and misleading medical information, according to new research.

Experts have warned of a “critical need” for stronger safeguards before the bots can be used in healthcare, adding models not only repeated untrue claims but also “confidently” expanded on them to create explanations for non-existent medical conditions.

The team from the Mount Sinai School of Medicine created fictional patient scenarios, each containing one fabricated medical terms such as a made-up disease, symptom, or test, and submitted them to leading large language models. In a study published in journal Communications Medicine, they said that the chatbots “routinely” expanded on the fake medical detail, giving a “detailed, decisive response based entirely on fiction”.

But their research also found that by adding one small prompt reminding the model the information provided might be inaccurate, errors could be reduced “significantly”.

“Our goal was to see whether a chatbot would run with false information if it was slipped into a medical question, and the answer is yes,” said co-corresponding senior author Eyal Klang, MD, from the Icahn School of Medicine at Mount Sinai. “Even a single made-up term could trigger a detailed, decisive response based entirely on fiction.

Researchers found the false responses were reduced by adding a one-line prompt (Getty/iStock)

“But we also found that the simple, well-timed safety reminder built into the prompt made an important difference, cutting those errors nearly in half. That tells us these tools can be made safer, but only if we take prompt design and built-in safeguards seriously.”

Co-author Dr Girish Nadkarni said the solution wasn’t to “abandon AI in medicine” but to “ensure human oversight remains central”. The team hope their work can help introduce a simple “fake-term” method for tech developers to use in testing medical AI systems.

“Our study shines a light on a blind spot in how current AI tools handle misinformation, especially in health care,” he said. “It underscores a critical vulnerability in how today’s AI systems deal with misinformation in health settings.

“A single misleading phrase can prompt a confident yet entirely wrong answer. The solution isn’t to abandon AI in medicine, but to engineer tools that can spot dubious input, respond with caution, and ensure human oversight remains central. We’re not there yet, but with deliberate safety measures, it’s an achievable goal.”

It comes after research last year showed many popular AI chatbots, including ChatGPT and Google’s Gemini, lack adequate safeguards to prevent the creation of health disinformation when prompted.

They found several large language models consistently created blog posts on false information, including the claim suncream causes skin cancer, when asked.

For more: Elrisala website and for social networking, you can follow us on Facebook
Source of information and images “independent”“