Question Answering for Electronic Health Records: Scoping Review of Datasets and Models.

Jayetri Bardhan, Kirk Roberts, Daisy Zhe Wang

doi:10.2196/53636

Abstract

Question answering (QA) systems for patient-related data can assist both clinicians and patients. They can, for example, support clinicians in decision-making and help patients better understand their medical history. Substantial amounts of patient data are stored in electronic health records (EHRs), making EHR QA an important research area. Because of differences in data format and modality, EHR QA differs greatly from other medical QA tasks that retrieve answers from medical websites or scientific papers, and it therefore warrants dedicated study. This study aims to provide a methodological review of existing works on QA for EHRs. The objectives of this study were to identify and analyze the existing EHR QA datasets, study the state-of-the-art methodologies used in this task, compare the evaluation metrics used by these state-of-the-art models, and elicit the challenges and ongoing issues in EHR QA. We searched for articles published from January 1, 2005, to September 30, 2023, in 4 digital sources (Google Scholar, ACL Anthology, ACM Digital Library, and PubMed) to collect relevant publications on EHR QA. Our systematic screening process followed PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A total of 4111 papers were identified for our study, and after screening based on our inclusion criteria, we obtained 47 papers for final review. The selected studies were then classified into 2 non-mutually exclusive categories depending on their scope: "EHR QA datasets" and "EHR QA models." Of the 47 papers, 53% (n=25) were about EHR QA datasets, and 79% (n=37) were about EHR QA models. We observed that QA on EHRs is relatively new and unexplored, and most of the works are fairly recent.
In addition, we observed that emrQA is by far the most popular EHR QA dataset, both in terms of citations and usage in other papers. We classified the EHR QA datasets based on their modality and inferred that the Medical Information Mart for Intensive Care (MIMIC-III) and the National Natural Language Processing Clinical Challenges datasets (ie, n2c2 datasets) are the most popular EHR databases and corpora used in EHR QA. Furthermore, we identified the different models used in EHR QA along with the evaluation metrics used for these models. EHR QA research faces multiple challenges, such as the limited availability of clinical annotations, concept normalization, and the difficulty of generating realistic EHR QA datasets. Many gaps in research remain that motivate further work. This study will help future researchers focus on the areas of EHR QA that offer promising research directions.

Journal: Journal of Medical Internet Research
Publication Date: Oct 30, 2024
License type: CC BY
