AI chatbots ‘highly vulnerable’ to repeating false medical information, experts warn

AI chatbots are frequently prone to repeating false and misleading medical information, according to new research.
Experts have warned of a “critical need” for stronger safeguards before the bots can be used in healthcare, adding models not only repeated untrue claims but also “confidently” expanded on them to create explanations for non-existent medical conditions.
The team from the Mount Sinai School of Medicine created fictional patient scenarios, each containing one fabricated medical terms such as a made-up disease, symptom, or test, and submitted them to leading large language models. In a study published in journal Communications Medicine, they said that the chatbots “routinely” expanded on the fake medical detail, giving a “detailed, decisive response based entirely on fiction”.
But their research also found that by adding one small prompt reminding the model the information provided might be inaccurate, errors could be reduced “significantly”.
“Our goal was to see whether a chatbot would run with false information if it was slipped into a medical question, and the answer is yes,” said co-corresponding senior author Eyal Klang, MD, from the Icahn School of Medicine at Mount Sinai. “Even a single made-up term could trigger a detailed, decisive response based entirely on fiction.
“But we also found that the simple, well-timed safety reminder built into the prompt made an important difference, cutting those errors nearly in half. That tells us these tools can be made safer, but only if we take prompt design and built-in safeguards seriously.”
Co-author Dr Girish Nadkarni said the solution wasn’t to “abandon AI in medicine” but to “ensure human oversight remains central”. The team hope their work can help introduce a simple “fake-term” method for tech developers to use in testing medical AI systems.
“Our study shines a light on a blind spot in how current AI tools handle misinformation, especially in health care,” he said. “It underscores a critical vulnerability in how today’s AI systems deal with misinformation in health settings.
“A single misleading phrase can prompt a confident yet entirely wrong answer. The solution isn’t to abandon AI in medicine, but to engineer tools that can spot dubious input, respond with caution, and ensure human oversight remains central. We’re not there yet, but with deliberate safety measures, it’s an achievable goal.”
It comes after research last year showed many popular AI chatbots, including ChatGPT and Google’s Gemini, lack adequate safeguards to prevent the creation of health disinformation when prompted.
They found several large language models consistently created blog posts on false information, including the claim suncream causes skin cancer, when asked.