Abstract

Background: Valuable scientific results on biomedicine are very rich, but they are widely scattered in the literature. Topic modeling enables researchers to discover themes from an unstructured collection of documents without any prior annotations or labels. In this paper, taking ginseng as an example, a biological dynamic topic model (Bio-DTM) is proposed to conduct a retrospective study and interpret the temporal evolution of ginseng research.

Methods: The Bio-DTM system has four main components: document pre-processing, bio-dictionary construction, dynamic topic modeling, and topic analysis and visualization. Scientific articles pertaining to ginseng were retrieved through text mining from PubMed. The bio-dictionary integrates the MedTerms medical dictionary, the second edition of the Side Effect Resource, A Dictionary of Biology, and the HGNC database of human gene names. A dynamic topic model, a text mining technique, was used to capture the development trends of topics in sequentially collected documents. In addition to the content of the topics, their evolution over time was visualized using ThemeRiver.

Results: Topic 9 showed that ginseng is used in dietary supplements and in complementary and integrative health practices, and has been very popular since the early twentieth century. Topic 6 indicated that the cultivation of ginseng is a major area of research and that symbiosis and allelopathy of ginseng became a research hotspot in 2007. In addition, the Bio-DTM model gave insight into the main pharmacologic effects of ginseng, such as its anti-metabolic disorder, cardioprotective, anti-cancer, hepatoprotective, anti-thrombotic, and neuroprotective effects.

Conclusion: The Bio-DTM model not only discovers what ginseng research involves but also shows how these topics evolve over time. This approach can be applied to the biomedical field to conduct retrospective studies and guide future studies.

Highlights

  • Valuable scientific results on biomedicine are very rich, but they are widely scattered in the literature

  • In this paper, a biological dynamic topic model (Bio-DTM) was established by combining a dynamic topic model (DTM) with a bio-dictionary to help researchers mine how related topics develop over time in a large collection of biomedical documents

  • In this paper, we developed a biological dynamic topic model (Bio-DTM) by extending the DTM to reveal topics and their evolution in biomedical literature

Summary

Introduction

Valuable scientific results on biomedicine are very rich, but they are widely scattered in the literature. Topic modeling enables researchers to discover themes from an unstructured collection of documents without any prior annotations or labels. Blei and Lafferty developed the correlated topic model (CTM), which remedies the inability of latent Dirichlet allocation (LDA) to model correlations between topics [15]. When applied to articles published in Science from 1990 to 1999, the CTM fit the document collection better than LDA [16]. The R packages topicmodels and lda both fit topic models, but they employ different estimation techniques [18, 19].
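
As a rough illustration of the first two components named in the abstract (document pre-processing and bio-dictionary construction), the sketch below filters tokens against a dictionary and groups the resulting term counts by publication year, which is the kind of sequential input a dynamic topic model consumes. It is a minimal sketch under stated assumptions, not the authors' implementation: the source does not say how Bio-DTM applies the bio-dictionary, so using it as a vocabulary filter is an assumption, and `BIO_DICTIONARY`, the helper functions, and the toy documents are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stand-in for the merged bio-dictionary. The real one integrates
# MedTerms, the Side Effect Resource, A Dictionary of Biology, and HGNC.
BIO_DICTIONARY = {"ginseng", "ginsenoside", "allelopathy", "symbiosis", "apoptosis"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def filter_tokens(tokens, dictionary):
    """Keep only tokens that appear in the dictionary (assumed filtering step)."""
    return [t for t in tokens if t in dictionary]


def slice_by_year(docs, dictionary):
    """Build one Counter of dictionary-filtered terms per publication year."""
    slices = defaultdict(Counter)
    for year, text in docs:
        slices[year].update(filter_tokens(tokenize(text), dictionary))
    # Sort by year so the slices are in the sequential order a DTM expects.
    return dict(sorted(slices.items()))


# Toy abstracts invented for illustration, not retrieved from PubMed.
docs = [
    (2007, "Allelopathy and symbiosis in ginseng cultivation."),
    (2006, "Ginseng extract induces apoptosis in tumor cells."),
    (2007, "Ginsenoside content of cultivated ginseng roots."),
]

slices = slice_by_year(docs, BIO_DICTIONARY)
for year, counts in slices.items():
    print(year, dict(counts))
```

Per-year term counts like these could also feed a ThemeRiver-style plot directly, although the paper visualizes topic strength rather than raw term frequency.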
