Abstract

Medical question answering (QA) systems have the potential to resolve clinicians' uncertainties about treatment and diagnosis on demand, informed by the latest evidence. However, despite the significant progress in general QA made by the NLP community, medical QA systems are still not widely used in clinical environments. One likely reason for this is that clinicians may not readily trust QA system outputs, in part because transparency, trustworthiness, and provenance have not been key considerations in the design of such models. In this paper we discuss a set of criteria that, if met, we argue would likely increase the utility of biomedical QA systems, which may in turn lead to adoption of such systems in practice. We assess existing models, tasks, and datasets with respect to these criteria, highlighting shortcomings of previously proposed approaches and pointing toward what more usable QA systems might look like.

Highlights

  • The use of rigorous empirical evidence to inform patient care is known as evidence-based medicine (EBM).

  • We argue that the deployment of EBM-guided question answering (QA) systems (by which we mean systems intended to answer clinical questions based on published evidence) in clinical practice is contingent on their outputs being reliable and actionable.

  • In accordance with these criteria, we suggest the following questions to assess the transparency of QA systems: 1. Do the answers come from reliable sources for health information? Not all research articles are equal, and there exist mature approaches to help clinicians identify the most reliable advice from the health literature.


Summary

Introduction

The use of rigorous empirical evidence to inform patient care is known as evidence-based medicine (EBM). Existing biomedical QA systems that answer questions with reference to the medical literature typically provide answers in the form of yes/no responses, factoids, lists, and/or definitions (Sarrouti and Ouatik El Alaoui, 2020; Ben Abacha and Zweigenbaum, 2015; Cao et al., 2011; Zahid et al., 2018; Yu et al., 2007) without supplying justifications, e.g., source journals, extracted text snippets, and/or associated statistics. QA systems that take a naive approach to evidence extraction, for example selecting an answer from an undifferentiated corpus of scientific literature and treating all studies as equally reliable, are likely to be considerably less useful to clinicians.
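
The paper does not prescribe an implementation, but a minimal sketch may make the contrast concrete. Assuming a retriever that returns scored snippets annotated with study-design metadata (the Evidence fields and DESIGN_WEIGHTS below are illustrative assumptions, not an established standard), a system could weight candidate evidence by a rough EBM hierarchy and surface provenance alongside the answer rather than returning a bare yes/no:

```python
# A minimal sketch (not from the paper) of evidence-aware answer ranking.
# It illustrates one way a QA system could avoid treating all studies as
# equally reliable: scale retrieval relevance by study design and return
# provenance (journal, snippet) with the answer.

from dataclasses import dataclass

# Illustrative reliability weights loosely inspired by the EBM evidence
# hierarchy; a real system would calibrate these rather than hard-code them.
DESIGN_WEIGHTS = {
    "systematic_review": 1.0,
    "randomized_controlled_trial": 0.8,
    "cohort_study": 0.5,
    "case_report": 0.2,
}

@dataclass
class Evidence:
    snippet: str            # extracted text supporting the answer
    journal: str            # source journal, surfaced for provenance
    study_design: str       # key into DESIGN_WEIGHTS
    retrieval_score: float  # relevance score from the retriever

def rank_evidence(candidates: list[Evidence]) -> list[Evidence]:
    """Order candidates by relevance scaled by design reliability."""
    return sorted(
        candidates,
        key=lambda e: e.retrieval_score * DESIGN_WEIGHTS.get(e.study_design, 0.1),
        reverse=True,
    )

if __name__ == "__main__":
    hits = [
        Evidence("Drug X reduced mortality vs. placebo ...", "NEJM",
                 "randomized_controlled_trial", 0.82),
        Evidence("A single patient improved on drug X ...", "Case Reports Med.",
                 "case_report", 0.91),
    ]
    best = rank_evidence(hits)[0]
    # Surface the answer together with its provenance, not just yes/no.
    print(f"Top evidence: {best.snippet!r} ({best.journal}, {best.study_design})")
```

Note that the weighting downranks the highly relevant but weakly designed case report in favor of the randomized trial, which is the behavior the paragraph above argues clinicians need.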
