Hadoop Recognition of Biomedical Named Entity Using Conditional Random Fields

Kenli Li, Fan Zhang, Keqin Li, Kai Hwang, Zhuo Tang, Wei Ai, Lingang Jiang

doi:10.1109/tpds.2014.2368568

Abstract

Processing large volumes of data has presented a challenging issue, particularly in data-redundant systems. As one of the most recognized models, the conditional random fields (CRF) model has been widely applied in biomedical named entity recognition (Bio-NER). Due to the internally sequential feature, performance improvement of the CRF model is nontrivial, which requires new parallelized solutions. By combining and parallelizing the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) and Viterbi algorithms, we propose a parallel CRF algorithm called MapReduce CRF (MRCRF) in this paper, which contains two parallel sub-algorithms to handle two time-consuming steps of the CRF model. The MapReduce L-BFGS (MRLB) algorithm leverages the MapReduce framework to enhance the capability of estimating parameters. Furthermore, the MapReduce Viterbi (MRVtb) algorithm infers the most likely state sequence by extending the Viterbi algorithm with another MapReduce job. Experimental results show that the MRCRF algorithm outperforms other competing methods by exhibiting significant performance improvement in terms of time efficiency as well as preserving a guaranteed level of correctness.
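
The abstract describes inferring the most likely state sequence with the Viterbi algorithm inside a MapReduce job. As a rough illustration only, the following Python sketch decodes each sequence independently in a map step and gathers the results in a reduce step. It is not the paper's MRVtb algorithm: the function names (`viterbi`, `map_decode`, `reduce_collect`), the log-score input layout, and the assumption that sequences can be decoded independently are all hypothetical choices made for this example.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Scores = Sequence[Sequence[float]]


def viterbi(emission: Scores, transition: Scores) -> List[int]:
    """Return the highest-scoring state path.

    emission[t][s]   : log-score of state s at position t (assumed layout)
    transition[p][s] : log-score of moving from state p to state s
    """
    if not emission:
        return []
    n_states = len(transition)
    score = list(emission[0])          # best score of a path ending in each state
    back: List[List[int]] = []         # back-pointers for positions 1..T-1
    for t in range(1, len(emission)):
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            # Best predecessor for state s at position t.
            best_p = max(range(n_states), key=lambda p: prev[p] + transition[p][s])
            ptr.append(best_p)
            score.append(prev[best_p] + transition[best_p][s] + emission[t][s])
        back.append(ptr)
    # Follow back-pointers from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def map_decode(record: Tuple[str, Scores, Scores]) -> Tuple[str, List[int]]:
    """Map step (hypothetical): decode one sequence, keyed by its id."""
    seq_id, emission, transition = record
    return seq_id, viterbi(emission, transition)


def reduce_collect(pairs: Iterable[Tuple[str, List[int]]]) -> Dict[str, List[int]]:
    """Reduce step (hypothetical): gather decoded paths by sequence id."""
    return dict(pairs)


if __name__ == "__main__":
    transition = [[0.0, -1.0], [-1.0, 0.0]]
    records = [
        ("s1", [[0.0, -2.0], [-0.5, -0.4], [-3.0, 0.0]], transition),
        ("s2", [[-1.0, 0.0]], transition),
    ]
    print(reduce_collect(map(map_decode, records)))
```

In a real Hadoop deployment the map and reduce functions would run as distributed tasks over input splits; here Python's built-in `map` stands in for that machinery so the sketch stays self-contained.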

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Nov 1, 2015
Citations: 61

Similar Papers

AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network
Xinyu Wang ... Tao Wang
01 Jan 2020

Decision letter: Graphical-model framework for automated annotation of cell identities in dense cellular images
Ronald L Calabrese
24 Aug 2020

Two-phase biomedical named entity recognition using CRFs
Lishuang Li ... Degen Huang
Computational Biology and Chemistry | VOL. 33
11 Jul 2009

The application effect of the Rasch measurement model combined with the CRF model: An analysis based on English discourse.
Yunxia Wang
PloS one | VOL. 19
01 Jan 2024
