Abstract

The placenta is a maternal-fetal organ that develops during pregnancy and supplies nutrients and oxygen to the growing fetus while removing its waste products. A better understanding of the placenta promises to improve the health of mothers and children, given its lifelong influence on health. However, the placenta remains poorly understood because it varies considerably across species and can no longer be studied as a functioning organ after delivery. The Placenta Atlas Tool (PAT) project aims to leverage advanced computational approaches to meld numerous and diverse datasets into an integrated resource that encourages a "systems biology" approach to the study of both normal and abnormal placental development and function throughout gestation. In this study, we introduce a multi-layer framework that automatically identifies PAT-relevant research in PubMed and builds a Placenta Curated Research Dataset (PCRD) to support placenta research. The framework combines several well-known Natural Language Processing (NLP) components, namely a Medical Subject Headings (MeSH) based Naïve Bayes classifier, abstract-based text similarity comparison, and MeSH-based article prioritization, to systematically select PAT-relevant publications for further data curation. In addition, we developed a user-friendly web application to incorporate human judgement at the final stage of publication identification. We obtained 22,047 articles from PubMed and programmatically identified 6,086 articles that are highly relevant to PAT via our framework. To assess the framework's performance, we manually reviewed a random set of articles using our web tool. Based on this review, the accuracy of article classification is greater than 90% and the accuracy of prioritization is greater than 80%. In summary, the multi-layer publication identification framework not only performs well in identifying placenta-related research but can also be readily extended to support research in other scientific fields.
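As a rough illustration of the first layer only, the sketch below shows how a Naïve Bayes classifier over MeSH terms might label an article as PAT-relevant or not. It is not the authors' implementation: the function names `train_nb` and `predict_nb`, the Laplace smoothing parameter `alpha`, the class labels, and the four toy training articles are all assumptions made for this example.

```python
import math
from collections import Counter


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial-style Naive Bayes model.

    docs   -- list of sets of MeSH terms, one set per article (toy data here)
    labels -- list of class labels, aligned with docs
    alpha  -- Laplace smoothing constant (an assumption, not from the paper)
    """
    vocab = set().union(*docs)
    class_counts = Counter(labels)
    term_counts = {c: Counter() for c in class_counts}
    for terms, label in zip(docs, labels):
        term_counts[label].update(terms)
    return {
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(labels),
        "class_counts": class_counts,
        "term_counts": term_counts,
    }


def predict_nb(model, terms):
    """Return the class with the highest log posterior for a set of MeSH terms."""
    vocab, alpha = model["vocab"], model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_c in model["class_counts"].items():
        counts = model["term_counts"][label]
        total = sum(counts.values())
        score = math.log(n_c / model["n_docs"])  # log prior
        for term in terms:
            if term in vocab:  # terms never seen in training are ignored
                score += math.log(
                    (counts[term] + alpha) / (total + alpha * len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training articles, each reduced to its MeSH terms.
docs = [
    {"Placenta", "Pregnancy", "Trophoblasts"},
    {"Placenta", "Fetal Development", "Pregnancy"},
    {"Neoplasms", "Mice"},
    {"Pregnancy", "Diabetes Mellitus", "Neoplasms"},
]
labels = ["relevant", "relevant", "other", "other"]

model = train_nb(docs, labels)
print(predict_nb(model, {"Placenta", "Trophoblasts"}))
print(predict_nb(model, {"Neoplasms", "Mice"}))
```

In a pipeline like the one described, articles labelled relevant at this stage would then pass to the abstract similarity and prioritization layers, which are not sketched here.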
