AINewsWire

Study Finds AI Fails at Differential Diagnosis in Over 80% of Cases

A new study suggests that generative AI has yet to develop the level of reasoning required for safe use in clinical settings. While recent systems show stronger performance when given detailed patient information, researchers found they still struggle with one of the most critical aspects of medical decision-making.

The analysis, conducted by Mass General Brigham researchers, examined how large language models handle diagnostic tasks. Their findings, published in JAMA Network Open, indicate that these systems fall short when it comes to the complex thinking processes doctors rely on in real-world practice.

According to co-author Marc Succi, current models are not suitable for independent clinical use. He pointed out that despite steady progress, these tools cannot yet match the reasoning involved in forming a differential diagnosis, a process he described as central to the practice of medicine.

Differential diagnosis involves identifying and comparing possible conditions that could explain a patient’s symptoms. It is typically the first step clinicians take before narrowing down to a conclusion.

To assess performance, the research team evaluated 21 language models, including systems from major developers such as OpenAI, Google, and Anthropic. The models were tested using 29 standardized clinical scenarios with the help of a new evaluation framework known as PrIME-LLM.

The framework measures how well systems perform across multiple stages of clinical reasoning, from forming an initial impression to selecting tests, reaching a diagnosis, and proposing treatment. To mirror real medical cases, the researchers introduced information in stages: models first received basic patient details, followed by examination findings and lab results.

Even when allowed to continue to later steps after failing early ones, the models showed a consistent weakness in generating appropriate lists of possible conditions. They failed to produce accurate differential diagnoses in over 80% of cases.

Performance improved significantly when additional data, such as imaging and laboratory results, were provided. Final diagnosis accuracy ranged from roughly 60% to above 90%, depending on the system.

Among the strongest performers were newer models, including Claude 4.5 Opus, Grok 4, Gemini 3.0 variants, and GPT-5. Despite these advances, the study concluded that even the most capable systems still lack the depth of reasoning needed for safe, unsupervised use.

The authors emphasized the continued importance of human oversight, warning that AI tools should support, not replace, clinical judgment.

Experts not involved in the study echoed these concerns. They stressed that AI should not be used to make healthcare decisions without professional oversight and advised the public to seek qualified medical advice when dealing with health issues.

Developers of cutting-edge software and hardware, such as D-Wave Quantum Inc. (NYSE: QBTS), may not be surprised by these findings, since AI models are only as good as the data they are trained on and tend to improve as more data becomes available.

About AINewsWire

AINewsWire (“AINW”) is a specialized communications platform with a focus on the latest advancements in artificial intelligence (“AI”), including the technologies, trends and trailblazers driving innovation forward. It is one of 75+ brands within the Dynamic Brand Portfolio @ IBN that delivers: (1) access to a vast network of wire solutions via InvestorWire to efficiently and effectively reach a myriad of target markets, demographics and diverse industries; (2) article and editorial syndication to 5,000+ outlets; (3) press release enhancement to ensure maximum impact; (4) social media distribution via IBN to millions of social media followers; and (5) a full array of tailored corporate communications solutions. With broad reach and a seasoned team of contributing journalists and writers, AINW is uniquely positioned to best serve private and public companies that want to reach a wide audience of investors, influencers, consumers, journalists, and the general public. By cutting through the overload of information in today’s market, AINW brings its clients unparalleled recognition and brand awareness.

AINW is where breaking news, insightful content and actionable information converge.

To receive SMS alerts from AINewsWire, text “AI” to 888-902-4192 (U.S. Mobile Phones Only)

For more information, please visit www.AINewsWire.com

Please see full terms of use and disclaimers on the AINewsWire website applicable to all content provided by AINW, wherever published or re-published: https://www.AINewsWire.com/Disclaimer

AINewsWire
Los Angeles, CA
www.AINewsWire.com
310.299.1717 Office
Editor@AINewsWire.com

AINewsWire is powered by IBN

Published by

AINewsWire

2 months ago
