Biomedical named entity recognition based on Glove-BLSTM-CRF model

Gelin Ning, Yunli Bai

doi:10.3233/jcm-204419

Abstract

Named entity recognition is a fundamental task in natural language processing. Biomedical named entities are numerous, their naming rules are not uniform, and their word formation is complex, all of which make biomedical named entity recognition difficult. Traditional machine learning algorithms rely heavily on manually extracted features, and the quality of those features directly affects the accuracy of entity recognition. In the biomedical domain, the cost of manually extracting features and annotating data sets is enormous. In recent years, deep learning methods that do not rely on hand-crafted features have made great progress in many domains. This paper proposes a Glove-BLSTM-CRF model for biomedical named entity recognition. First, the Glove model is used to train word vectors that carry semantic features, and a BLSTM is used to train word vectors that carry character-level morphological features. The two vectors are combined to form the final representation of each word, which is then fed into a BLSTM-CRF deep learning model to recognize the entity categories. The experimental results show that, without relying on any hand-crafted features or rules, the model performs well on the JNLPBA 2004 biomedical named entity recognition task, reaching an F1 value of 75.62%.
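
The abstract does not describe how the CRF layer turns per-token scores into entity tags, so the following is only a minimal sketch of standard Viterbi decoding for a linear-chain CRF, not the authors' implementation. It assumes the BLSTM has already produced one emission score per tag for each token and that the CRF has learned tag-to-tag transition scores. The function name `viterbi_decode`, the three-token sentence, the BIO tag set, and every number below are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   tags: List[str]) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][tag]      : score of `tag` at token t (assumed BLSTM output)
    transitions[prev][cur] : score of moving from `prev` to `cur` (assumed CRF weights)
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any tag path that ends in `tag` at token t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t - 1][tag] = previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy BIO tag set for a single entity type (made up for this sketch).
TAGS = ["O", "B-protein", "I-protein"]

# Made-up transition scores: "O" followed by "I-protein" is heavily penalized.
TRANSITIONS = {
    "O":         {"O": 0.0, "B-protein": 0.0,  "I-protein": -10.0},
    "B-protein": {"O": 0.0, "B-protein": -1.0, "I-protein": 1.0},
    "I-protein": {"O": 0.0, "B-protein": -1.0, "I-protein": 0.5},
}

# Made-up emission scores for a hypothetical sentence: "the IL-2 receptor".
EMISSIONS = [
    {"O": 2.0, "B-protein": 0.1, "I-protein": 0.0},
    {"O": 0.5, "B-protein": 1.0, "I-protein": 1.2},
    {"O": 0.8, "B-protein": 0.2, "I-protein": 1.0},
]

# Picking the best tag per token independently yields an invalid O -> I sequence.
greedy = [max(TAGS, key=lambda tag: scores[tag]) for scores in EMISSIONS]
print(greedy)  # -> ['O', 'I-protein', 'I-protein']

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(decoded)  # -> ['O', 'B-protein', 'I-protein']
```

In this toy case, choosing the best tag for each token independently starts an entity with an I- tag, while decoding over the whole sequence with transition scores recovers a well-formed B-/I- span. This is the usual motivation for placing a CRF layer on top of a BLSTM tagger.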

Full Text