AI Made the Right Diagnosis in 67% of Emergency Cases - the Doctors Were at 55% and 50%

A study by Harvard and Beth Israel hospital, published in the journal Science, found that OpenAI's o1 model produced an accurate or close-to-accurate diagnosis for 67 percent of emergency patients at initial triage, compared with 55 percent and 50 percent for the two doctors it was tested against. The study covers 76 emergency cases, with the AI system receiving the same raw input data as the doctors.

The result is significant, but the researchers themselves temper expectations. The comparison was with internal medicine specialists, not emergency physicians. And Adam Rodman of Beth Israel, one of the authors, openly admits: "There is currently no formal mechanism for accountability when an AI makes the wrong diagnosis." That is a question the system neither asks nor answers.

Some doctors reacted with scepticism almost immediately. Kristen Panthagani, an emergency physician, called the headlines around the study "hyped up": an interesting piece of research, but not evidence for clinical use. The researchers themselves acknowledge that prospective real-world trials are needed before any actual use in hospitals.

The practical reality is this: AI reads text quickly and well, which matters in emergency settings where seconds count. But diagnosis is not just text - it is physical examination, nuances in the patient's voice, context that doesn't fit into a data field. Will these models be in hospitals in the Balkans soon? Probably not - but their arrival within ten years is close to certain.