Abstract

In biomedicine, scientific literature is a valuable source for knowledge discovery. Mining knowledge from textual data has become an increasingly important task as the volume of scientific literature grows at an unprecedented rate. In this paper, we propose a framework for examining a given disease based on existing information in the scientific literature. Disease-related entities, namely diseases, drugs, and genes, are systematically extracted and analyzed using a three-level network-based approach. A paper-entity network and an entity co-occurrence network (macro-level) are explored and used to construct six entity-specific networks (meso-level). Important diseases, drugs, and genes, as well as salient entity relations (micro-level), are identified from these networks. Results obtained from this literature mining can assist clinical applications.
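As a rough illustration of the macro-level step, the sketch below projects a paper-entity network onto an entity co-occurrence network. It is a minimal sketch, not the authors' implementation: the toy data, the placeholder entity names, and the function names `cooccurrence_network` and `weighted_degree` are all assumptions made for illustration. Weighted degree is used here only as a simple stand-in for an importance score.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from the paper): paper ID -> entities it mentions.
paper_entities = {
    "paper_1": {"disease_A", "drug_X", "gene_G"},
    "paper_2": {"disease_A", "disease_B"},
    "paper_3": {"disease_A", "drug_X"},
}


def cooccurrence_network(paper_entities):
    """Project the paper-entity network onto entities.

    Returns a Counter mapping an alphabetically ordered entity pair to the
    number of papers in which both entities appear.
    """
    weights = Counter()
    for entities in paper_entities.values():
        # Sorting gives every undirected edge one canonical key.
        for a, b in combinations(sorted(entities), 2):
            weights[(a, b)] += 1
    return weights


def weighted_degree(weights):
    """Sum the co-occurrence weights on each entity's edges."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree


edges = cooccurrence_network(paper_entities)
degree = weighted_degree(edges)

# List entities from most to least connected in the toy network.
for entity, score in degree.most_common():
    print(entity, score)
```

On a real corpus, the same edge weights could be passed to a graph library to compute centrality measures on the co-occurrence network.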

Highlights

  • Scientific literature is the primary source for scholars to communicate with others as well as the public

  • Hepatocellular carcinoma is a common type of liver cancer, caused by cirrhosis in most cases

  • Compared with PageRank, betweenness centrality includes more specific terms, as well as terms that may not be associated with liver cancer, such as thyrotoxicosis, mitochondrial dysfunction, and HPV


Summary

Introduction

Scientific literature is the primary channel through which scholars communicate with one another and with the public. As online access to scholarly literature has improved, its growth rate has become unprecedentedly high. Linear growth in publications has been reported for fields such as bioinformatics [1]. One consequence of this proliferation is that consumption of the literature lags behind its production. To alleviate this tension, scholars have applied a variety of text mining techniques, such as information extraction [2], topic modeling [3], and document summarization [4], to systematically distill knowledge from large scientific literature corpora.
