Abstract
In Information Retrieval (IR), the semantic gap represents the mismatch between users’ queries and how retrieval models answer to these queries. In this paper, we explore how to use external knowledge resources to enhance bag-of-words representations and reduce the effect of the semantic gap between queries and documents. In this regard, we propose several simple but effective knowledge-based query expansion and reduction techniques, and we evaluate them for the medical domain. The query reformulations proposed are used to increase the probability of retrieving relevant documents through the addition to, or the removal from, the original query of highly specific terms. The experimental analyses on different test collections for Precision Medicine IR show the effectiveness of the developed techniques. In particular, a specific subset of query reformulations allow retrieval models to achieve top performing results in all the considered test collections.
Highlights
We added for each measure the lowest score required to enter the top 10 Text REtrieval Conference (TREC) results list, and the score obtained by our best combination
Compare them with the results obtained by the research groups that participated at TREC Precision Medicine (PM) 2017, 2018, and 2019 (Part B)
We have investigated how to use external knowledge resources to enhance bag-of-words representations and reduce the effect of the semantic gap between queries and documents, focusing on an important use-case in Clinical Decision Support (CDS): providing relevant information to clinicians treating cancer patients
Summary
A patient with a recently diagnosed condition would generally benefit from introductory information about the treatment of the disease, a trained physician would instead expect more detailed information when deciding the course of treatment, and a medical researcher would require references and literature relevant to their future research articles [1]. This diversity of users with different information needs and medical expertise represents a central issue in medical Information.
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have