Abstract
In biomedicine, scientific literature is a valuable source for knowledge discovery. Mining knowledge from textual data has become increasingly important as the volume of scientific literature grows at an unprecedented rate. In this paper, we propose a framework for examining a given disease based on existing information in the scientific literature. Disease-related entities, namely diseases, drugs, and genes, are systematically extracted and analyzed using a three-level network-based approach. A paper-entity network and an entity co-occurrence network (macro-level) are explored and used to construct six entity-specific networks (meso-level). Important diseases, drugs, and genes, as well as salient entity relations (micro-level), are identified from these networks. The results of this literature-based mining can assist clinical applications.
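As an illustration of the macro- and meso-level steps, the following minimal sketch derives an entity co-occurrence network from a paper-entity mapping and then splits it by entity type. It is not the implementation used in the paper: the toy data, the placeholder entity names (`drug_A`, `gene_X`), and the function names are all hypothetical, and grouping edges by the unordered pair of entity types is only one plausible reading of how three entity types give six entity-specific networks.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: paper ID -> set of (entity_type, entity_name) pairs.
paper_entities = {
    "paper_1": {("disease", "hepatocellular carcinoma"), ("drug", "drug_A"), ("gene", "gene_X")},
    "paper_2": {("disease", "hepatocellular carcinoma"), ("disease", "cirrhosis")},
    "paper_3": {("disease", "hepatocellular carcinoma"), ("drug", "drug_A")},
}


def cooccurrence_network(paper_entities):
    """Macro level: weight each entity pair by the number of papers mentioning both."""
    weights = Counter()
    for entities in paper_entities.values():
        # Sorting gives every unordered pair a single canonical (a, b) key.
        for a, b in combinations(sorted(entities), 2):
            weights[(a, b)] += 1
    return weights


def entity_specific_networks(weights):
    """Meso level (assumed grouping): split edges by the pair of entity types they link."""
    networks = {}
    for (a, b), weight in weights.items():
        type_pair = tuple(sorted((a[0], b[0])))
        networks.setdefault(type_pair, {})[(a, b)] = weight
    return networks


weights = cooccurrence_network(paper_entities)
networks = entity_specific_networks(weights)

for type_pair, edge_weights in sorted(networks.items()):
    print(type_pair, edge_weights)
```

With three entity types, this grouping yields at most six networks: three same-type networks and three cross-type networks. The toy data above populates only four of them.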
Highlights
Scientific literature is the primary channel through which scholars communicate with one another and with the public
Hepatocellular carcinoma is a common type of liver cancer, caused in most cases by cirrhosis
Compared to PageRank, betweenness centrality highlights more specific terms, as well as terms that may not be associated with liver cancer, such as thyrotoxicosis, mitochondrial dysfunction, and HPV
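To make the contrast between the two measures concrete, the sketch below computes PageRank (by power iteration) and betweenness centrality (with Brandes' algorithm) on a small undirected, unweighted graph. The graph, node names, and parameter values are hypothetical and are not taken from the paper's networks. The example only shows, under these assumptions, that a node bridging two clusters can rank first by betweenness while ranking low by PageRank.

```python
from collections import deque

# Hypothetical toy graph: triangles (a, b, c) and (d, e, f) joined through bridge node "x".
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "x"),
         ("x", "d"), ("d", "e"), ("d", "f"), ("e", "f")]


def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank; assumes every node has at least one neighbor."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        rank = {
            v: (1 - damping) / n + damping * sum(rank[u] / len(adj[u]) for u in adj[v])
            for v in adj
        }
    return rank


def betweenness(adj):
    """Brandes' algorithm for unnormalized betweenness on an unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate dependencies, farthest nodes first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in score.items()}


adj = adjacency(edges)
pr = pagerank(adj)
bc = betweenness(adj)

for node in sorted(adj):
    print(node, round(pr[node], 3), bc[node])
```

In this toy graph the bridge node `x` has the highest betweenness because every shortest path between the two triangles passes through it, while its PageRank is lower than that of its better-connected neighbors `c` and `d`.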
Summary
Scientific literature is the primary channel through which scholars communicate with one another and with the public. As online access to scholarly literature improves, the literature is growing at an unprecedented rate. Linear growth in publications has been reported for fields such as bioinformatics [1]. One consequence of this proliferation is that the consumption of scientific literature lags behind its production. To alleviate this tension, scholars have applied a variety of text mining techniques, such as information extraction [2], topic modeling [3], and document summarization [4], to systematically distill knowledge from large scientific literature corpora.