Improving the dictionary lookup approach for disease normalization using enhanced dictionary and query expansion.

Jitendra Jonnagaddala, Toni Rose Jue, Hong-Jie Dai, Nai-Wen Chang

doi:10.1093/database/baw112


Open Access

https://doi.org/10.1093/database/baw112


Journal: Database	Publication Date: Jan 1, 2016
Citations: 17	License type: cc-by

Affiliation: UNSW Sydney, National Taitung University

Abstract

The rapidly growing biomedical literature calls for an automatic approach to recognizing and normalizing disease mentions in order to increase the precision and effectiveness of disease-based information retrieval. A variety of methods have been proposed for disease named entity recognition and normalization. Among them, conditional random fields (CRFs) and dictionary lookup are widely used for named entity recognition and normalization, respectively. We developed a CRF-based model for automated recognition of disease mentions and studied the effect of various techniques on improving normalization results based on the dictionary lookup approach. The dataset from the BioCreative V CDR track was used to report the performance of the developed normalization methods and to compare them with other existing dictionary lookup based normalization methods. The best configuration achieved an F-measure of 0.77 for disease normalization, outperforming the best dictionary lookup based baseline method studied in this work by 0.13 in F-measure.

Database URL: https://github.com/TCRNBioinformatics/DiseaseExtract
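To make the idea of dictionary lookup with query expansion concrete, the following is a minimal sketch and not the authors' implementation. The toy dictionary, the placeholder concept identifiers (which are not real MeSH IDs), the abbreviation and synonym tables, and the function names are all assumptions introduced for illustration only.

```python
import re

# Hypothetical toy dictionary: lower-cased disease name -> concept identifier.
# The identifiers are placeholders, not real MeSH IDs.
DICTIONARY = {
    "hypertension": "C001",
    "myocardial infarction": "C002",
    "kidney failure": "C003",
}

# Hypothetical expansion resources (assumptions for this sketch).
ABBREVIATIONS = {"mi": "myocardial infarction"}
SYNONYMS = {"renal": "kidney", "failures": "failure"}


def normalize_text(mention):
    """Lower-case the mention and replace runs of punctuation with one space."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", mention.lower())
    return cleaned.strip()


def expand_query(mention):
    """Yield candidate lookup strings for a mention, most literal first."""
    base = normalize_text(mention)
    yield base
    # Expand a known abbreviation to its long form.
    if base in ABBREVIATIONS:
        yield ABBREVIATIONS[base]
    # Replace each token with its preferred synonym, if one is listed.
    yield " ".join(SYNONYMS.get(token, token) for token in base.split())


def lookup(mention):
    """Return the identifier of the first candidate found, else None."""
    for candidate in expand_query(mention):
        if candidate in DICTIONARY:
            return DICTIONARY[candidate]
    return None
```

In this sketch the exact (cleaned) mention is tried first and expanded forms are tried only afterwards, so a literal dictionary match always takes precedence over an expansion. A mention that matches no candidate returns None rather than a guessed identifier.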

Highlights

The importance of extracting disease-related information mapped to a standardized vocabulary is increasing with the yearly increase of published biomedical literature [1].
The Medical Subject Headings (MeSH) terminology was developed by the National Library of Medicine to speed up and increase the precision of biomedical literature retrieval [4].
Compared with other similar dictionary-based methods, our results suggest that, with the right combination of additional techniques, we can significantly improve the performance of dictionary lookup based disease name normalization (DNORM).

Summary

Introduction

The importance of extracting disease-related information mapped to a standardized vocabulary is increasing with the yearly increase of published biomedical literature [1]. In 2011, over 20 million documents were available in PubMed alone, growing by an average of 4% per year, and disease-related keywords were the second most common user search query [1]. A PubMed query using the keywords ‘disease OR diseases OR disorder OR disorders’ in early 2016 returned over 6.5 million documents, with an average yearly increase of 6% from 2000 to 2014 (Figure 1). Comparable trends can be observed in specific disease categories such as cancer and cardiovascular diseases. Because of this growth in available literature, researchers face the challenge of identifying the biomedical documents relevant to them [2,3]. Text mining techniques can help overcome this challenge [5].

Methods

Results

Discussion

Conclusion


