Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents

Sun Kim, Nicolas Fiorini, W John Wilbur, Zhiyong Lu

doi:10.1016/j.jbi.2017.09.014

Open Access

https://doi.org/10.1016/j.jbi.2017.09.014

Abstract

Traditional information retrieval (IR) mainly examines how many words from a query appear in a document. A drawback of this approach is that it may fail to detect relevant documents that contain few or none of the query words. Semantic analysis methods such as LSA (latent semantic analysis) and LDA (latent Dirichlet allocation) have been proposed to address this issue, but their performance is not superior to that of common IR approaches. Here we present a query-document similarity measure motivated by the Word Mover’s Distance. Unlike other similarity measures, the proposed method relies on neural word embeddings to compute the distance between words. This helps identify related words when no direct matches are found between a query and a document. Our method is efficient and straightforward to implement. Experimental results on TREC Genomics data show that our approach outperforms the BM25 ranking function by an average of 12% in mean average precision. Furthermore, for a real-world dataset collected from the PubMed® search logs, we combine the semantic measure with BM25 using a learning-to-rank method, which improves ranking scores by up to 25%. This experiment demonstrates that the proposed approach and BM25 complement each other and together produce superior performance.
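The idea of matching query words to related document words through embeddings can be illustrated with a minimal sketch. It is not the exact measure from the paper, which the abstract does not spell out. It assumes a simplified, Word Mover's Distance-inspired score in which each query word is paired with its closest document word by cosine similarity and the best matches are averaged. The embedding values, the `EMBEDDINGS` table, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# Real systems would load vectors trained on a large corpus.
EMBEDDINGS = {
    "cancer": [0.9, 0.1, 0.0],
    "tumor": [0.8, 0.2, 0.1],
    "therapy": [0.1, 0.9, 0.0],
    "treatment": [0.2, 0.8, 0.1],
    "protein": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_similarity(query_words, doc_words, embeddings=EMBEDDINGS):
    """Average, over query words, of the best cosine match in the document.

    Assumption: this is a simplified stand-in for a Word Mover's
    Distance-style measure, not the formula used in the paper.
    Words without an embedding are ignored.
    """
    q = [w for w in query_words if w in embeddings]
    d = [w for w in doc_words if w in embeddings]
    if not q or not d:
        return 0.0
    best_matches = [
        max(cosine(embeddings[qw], embeddings[dw]) for dw in d) for qw in q
    ]
    return sum(best_matches) / len(q)


# The query shares no words with either document, so pure word overlap
# cannot tell the two documents apart.
query = ["cancer", "therapy"]
doc_related = ["tumor", "treatment"]
doc_unrelated = ["protein"]

score_related = semantic_similarity(query, doc_related)
score_unrelated = semantic_similarity(query, doc_unrelated)
print(score_related > score_unrelated)
```

With these toy vectors the related document scores higher than the unrelated one even though neither contains a query word, which is the situation the abstract describes. A score of this kind could then be supplied as one feature alongside BM25 in a learning-to-rank model.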

Journal: Journal of Biomedical Informatics
Publication Date: Oct 3, 2017
Citations: 77
License type: publisher-specific-oa

Similar Papers

Semantic Annotation for Context-Aware Information Retrieval for Supporting the Environmental Review of Transportation Projects
Xuan Lv ... Nora M El-Gohary
16 Jun 2015

An Efficient Topic Modeling Approach for Text Mining and Information Retrieval through K-means Clustering
Junaid Rashid ... Aun Irtaza
Mehran University Research Journal of Engineering and Technology | VOL. 39
01 Jan 2020

A comparative analysis of Latent Semantic analysis and Latent Dirichlet allocation topic modeling methods using Bible data
Vasantha Kumari Garbhapu
Indian Journal of Science and Technology | VOL. 13
20 Nov 2020

Deep Is Better? An Empirical Comparison of Information Retrieval and Deep Learning Approaches to Code Summarization
Tingwei Zhu ... Tian Zhang
ACM Transactions on Software Engineering and Methodology | VOL. 33
15 Mar 2024
