Artificial Intelligence Matches or Outperforms Human Specialists in Retina and Glaucoma Management

Key Takeaways

  • An artificial intelligence (AI) system can match, or in some cases outperform, human ophthalmologists in the diagnosis and treatment of patients with glaucoma and retinal disease.
  • The findings suggest that advanced AI tools could play an important role in providing decision-making support to ophthalmologists in the diagnosis and management of cases involving glaucoma and retinal disorders.

A large language model (LLM) AI system can match, or in some cases outperform, human ophthalmologists in the diagnosis and treatment of patients with glaucoma and retinal disease, according to research from New York Eye and Ear Infirmary of Mount Sinai (NYEE). Large language models analyze vast arrays of text to learn how likely words are to occur next to each other. The provocative study, published February 22, 2024, in JAMA Ophthalmology, suggests that advanced AI tools could play an important role in providing decision-making support to ophthalmologists in the diagnosis and management of cases involving glaucoma and retinal disorders, which afflict millions of patients.

The study matched the knowledge of 12 ophthalmic specialists against the capabilities of the latest-generation ChatGPT system, GPT-4 (Generative Pre-trained Transformer 4) from OpenAI, designed to replicate human-level performance. A basic set of 20 questions (10 each for glaucoma and retina) was randomly selected from the American Academy of Ophthalmology’s list of questions commonly asked by patients, along with 20 deidentified patient cases culled from Mount Sinai-affiliated eye clinics. Responses from both the GPT-4 system and the human specialists were then rated for accuracy and thoroughness on a Likert scale, which is commonly used in clinical research to score responses, and the ratings were statistically analyzed.

The results showed that AI matched or outperformed human specialists in both accuracy and completeness of its medical advice and assessments. More specifically, AI demonstrated superior performance in response to glaucoma questions and case-management advice, while reflecting a more balanced outcome in retina questions, where AI matched humans in accuracy but exceeded them in completeness.

“The performance of GPT-4 in our study was quite eye-opening,” says Andy Huang, MD, an ophthalmology resident at NYEE, and lead author of the study. Dr. Huang told MedPage Today that he had expected that the chatbot would do worse, as was the case in a 2023 study in which a chatbot bungled almost all the answers and even offered harmful advice, “but there’s no place where people did better.” While emphasizing that additional testing is needed, Dr. Huang believes this work points to a promising future for AI in ophthalmology. “It could serve as a reliable assistant to eye specialists by providing diagnostic support and potentially easing their workload, especially in complex cases or areas of high patient volume,” he explains. “For patients, the integration of AI into mainstream ophthalmic practice could result in quicker access to expert advice, coupled with more informed decision-making to guide their treatment.”

Sources:

The Mount Sinai Hospital / Mount Sinai School of Medicine, ScienceDaily, February 22, 2024.

Randy Dotinga, “‘Eye Opening’: chatbot outperforms ophthalmologists,” MedPage Today, February 23, 2024.