Abstract

Biomedical literature is an essential source of biomedical evidence. To translate that evidence into biomedical research, researchers often need to carefully read multiple articles about a specific biomedical issue. These articles need to be highly related to each other: they should share similar core content, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve the articles that are highly related to r. In this paper, we present a technique, PBC (Passage-based Bibliographic Coupling), that estimates inter-article similarity by integrating bibliographic coupling with information collected from the context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of articles that biomedical experts judge to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. This contribution is essential for researchers and text mining systems that aim to cross-validate the evidence about specific gene-disease associations.

Highlights

  • Biomedical literature has accumulated a huge and ever-increasing amount of biomedical evidence

  • We propose a novel technique, PBC (Passage-based Bibliographic Coupling), which integrates bibliographic coupling with information collected from the context passages of important out-link citations in each article

  • PBC performs significantly better than bibliographic coupling similarity (BC) in Mean Average Precision (MAP) under all settings of α ∈ {5, 10, 15, 20}. It tends to perform better when α is between 10 and 20, which is close to the number of words that authors of biomedical articles use to explain why a citation is discussed in their articles
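
The baseline (BC) and the metric (MAP) named in the highlights can be illustrated with a minimal Python sketch. It assumes that raw bibliographic coupling is the number of references two articles share, and that MAP is the mean over target articles of the usual average precision. The function names and toy data are hypothetical, and the sketch does not implement PBC itself, whose passage-based weighting is not specified in this section.

```python
from typing import Dict, List, Set


def bc_similarity(refs_a: Set[str], refs_b: Set[str]) -> int:
    """Raw bibliographic coupling: number of references shared by two articles."""
    return len(refs_a & refs_b)


def rank_by_bc(target: str, references: Dict[str, Set[str]]) -> List[str]:
    """Rank all other articles by BC score against the target, highest first."""
    candidates = [a for a in references if a != target]
    # Sort by descending score; ties are broken by article id for determinism.
    return sorted(
        candidates,
        key=lambda a: (-bc_similarity(references[target], references[a]), a),
    )


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average of precision values at each rank where a relevant article appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, article in enumerate(ranking, start=1):
        if article in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], relevant: Dict[str, Set[str]]
) -> float:
    """Mean of average precision over all target articles."""
    if not rankings:
        return 0.0
    scores = [average_precision(rankings[t], relevant.get(t, set())) for t in rankings]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy data: article id -> set of cited reference ids.
    references = {
        "r": {"c1", "c2", "c3"},
        "a": {"c1", "c2"},
        "b": {"c3", "c9"},
        "c": {"c7", "c8"},
    }
    ranking = rank_by_bc("r", references)
    print(ranking)
    print(mean_average_precision({"r": ranking}, {"r": {"a", "c"}}))
```

In this toy setup, an article that shares no references with r receives a BC score of zero regardless of its content. The abstract describes PBC as addressing this kind of limitation by also using the text around citations.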


Summary

Introduction

Biomedical literature has accumulated a huge and ever-increasing amount of biomedical evidence. Given a specific issue (e.g., associations among specific genes, diseases, chemicals, and proteins), researchers need to carefully read multiple articles to exclude controversial evidence about the issue. To maintain a database of gene-disease associations, Genetics Home Reference (GHR) recruits hundreds of curators who carefully check multiple articles [1] and routinely update the database each week [2]. One way to retrieve the articles is to set a query about specific biomedical entities (e.g., genes and diseases) and search for those articles that are related to the query [5]. Another way is to

PLOS ONE | DOI:10.1371/journal.pone.0139245 October 6, 2015

Methods
Results
Discussion
Conclusion
