Research Alert

 Newswise — A Yale School of Medicine team has created a local large language model (LLM) pipeline that can accurately detect recurrent gastrointestinal bleeding (GIB) using electronic health records.

Researchers say early identification of recurrent GIB may lead to higher-quality care and improved patient outcomes. Robust identification of GIB in the electronic health record can also lead to more accurate risk-adjusted coding of diagnoses, so that healthcare providers receive appropriate reimbursement from insurers for high-complexity patients.

"We really want to know who the high-risk patients are who might bleed again," said Dennis L. Shung, MD, Assistant Professor in the Department of Internal Medicine and the Department of Biomedical Informatics and Data Science, who was the senior author of the study. "We want to make sure they receive the right care at the right time, but it's very difficult and time-intensive to detect recurrent bleeding based on manual chart review."

The Yale team trained and validated its model using electronic health records from 546 patients who were hospitalized with acute GIB and underwent upper endoscopy at 2 hospitals in the Yale-New Haven Health System. When they applied the pipeline to a different group of 562 patients who had undergone upper endoscopy at 4 other hospitals in the same health system, researchers say the LLM accurately identified recurrent GIB 97% of the time, compared with 0% for MetaMap, a publicly available natural language processing (NLP) tool developed by the National Library of Medicine.

The team says its tool would allow providers to monitor patients more easily after procedures, alerting doctors when a particular patient needs further observation or intervention. "Every single day we would monitor patients who have undergone upper endoscopy for acute GIB and make sure you're not meeting criteria for recurrent bleeding," Shung said. "And then once it happens, then we could say, this patient needs to be back on our radar immediately."

Researchers say the more accurate coding enabled by the algorithm would have increased potential reimbursement from insurers by $1,299 to $3,247 per patient. "It was really exciting that a large language model based pipeline could not only identify a very complex, clinically relevant quality outcome, which is rewarding," Shung said, "but also identify areas where the health system could get reimbursed for the highly complex level of care they provide."

Other study authors included Neil S. Zheng, Vipina K. Keloth, Daniel Kats, Darrick K. Lee, Hamita Sachar, Hua Xu, and Loren Laine.

Journal Link: Gastroenterology, Sept-2024