Abstract

As the biomedical literature grows exponentially, biomedical named entity recognition (BNER) has become an important task in biomedical information extraction. In previous studies based on deep learning, pretrained word embeddings have become an indispensable part of neural network models, effectively improving their performance. However, the biomedical literature typically contains numerous polysemous and ambiguous words, so fixed pretrained word representations are not appropriate. Therefore, this paper adopts pretrained embeddings from language models (ELMo) to generate dynamic word embeddings according to context. In addition, to mitigate the problem of insufficient training data in specific fields and to introduce richer input representations, we propose a multitask learning multichannel bidirectional gated recurrent unit (BiGRU) model. Multiple feature representations (e.g., word-level, contextualized word-level, and character-level) are fed, separately or collectively, into the different channels. Because the BiGRU captures features automatically, manual participation and feature engineering can be avoided. In the merge layer, multiple methods are designed to integrate the outputs of the multichannel BiGRU. We combine the BiGRU with a conditional random field (CRF) to address label dependence in sequence labeling. Moreover, within the multitask learning framework, we introduce auxiliary corpora with the same entity types as the main evaluation corpora, train our model on these separate corpora, and share parameters between them. Our model obtains promising results on the JNLPBA and NCBI-disease corpora, with F1-scores of 76.0% and 88.7%, respectively. The latter is the best performance among reported feature-based models.
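The role of the CRF layer described above is to score whole label sequences rather than individual tokens, so that invalid transitions (e.g., an I tag directly after an O tag in BIO tagging) are penalized at decoding time. The following is a minimal, self-contained sketch of Viterbi decoding over toy emission and transition scores; all names and values are illustrative and are not taken from the paper's implementation:

```python
# Minimal Viterbi decoder illustrating how a CRF layer enforces
# label dependencies in sequence labeling (illustrative sketch only).

def viterbi_decode(emissions, transitions, labels):
    """emissions: list of {label: score} dicts, one per token;
    transitions: {(prev_label, cur_label): score};
    returns the highest-scoring label sequence."""
    # Initialize each path with the first token's emission score.
    paths = {lab: ([lab], emissions[0][lab]) for lab in labels}
    for emit in emissions[1:]:
        new_paths = {}
        for cur in labels:
            # Choose the best previous label for the current label.
            prev, (seq, score) = max(
                paths.items(),
                key=lambda kv: kv[1][1] + transitions[(kv[0], cur)],
            )
            new_paths[cur] = (seq + [cur],
                              score + transitions[(prev, cur)] + emit[cur])
        paths = new_paths
    best_seq, _best_score = max(paths.values(), key=lambda v: v[1])
    return best_seq

# Toy BIO example: strongly discourage the transition O -> I.
labels = ["B", "I", "O"]
transitions = {(p, c): 0.0 for p in labels for c in labels}
transitions[("O", "I")] = -10.0
emissions = [{"B": 2.0, "I": 0.0, "O": 1.0},
             {"B": 0.0, "I": 1.5, "O": 1.0},
             {"B": 0.0, "I": 0.5, "O": 1.0}]
print(viterbi_decode(emissions, transitions, labels))  # -> ['B', 'I', 'O']
```

In a neural BiGRU-CRF, the emission scores would come from the BiGRU outputs and the transition scores would be learned parameters; the decoding step itself is the same dynamic program shown here.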

Highlights

  • Named entity recognition (NER) aims to identify and extract specific entities from massive unstructured text data, which has become a primary task for information extraction, text analysis, text mining, etc.

  • Conventional machine learning methods were widely used for NER, such as support vector machine (SVM), conditional random field (CRF), and maximum entropy model (MEM)

  • The experiment compares the performance of the multichannel bidirectional gated recurrent unit (BiGRU) with existing feature-based methods in biomedical named entity recognition (BNER)


Introduction

Named entity recognition (NER) aims to identify and extract specific entities (persons, places, organizations, and so on) from massive unstructured text data and has become a primary task for information extraction, text analysis, text mining, etc. Effectively extracting valuable information has become a serious challenge for researchers in the biomedical field, and biomedical named entity recognition (BNER) is an indispensable step toward meeting it. Conventional machine learning methods were widely used for NER, such as the support vector machine (SVM), conditional random field (CRF), and maximum entropy model (MEM). Finkel et al. [1] combined distant resources and additional features to identify biomedical entities. Liao et al. [5] adopted the Skip-Chain CRF model to recognize entities, effectively capturing features of the distant context. Feature engineering is an essential element of these conventional machine learning methods.
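As an illustration of the feature engineering such conventional taggers depend on, a token-level feature extractor might look like the following sketch. The specific features are hypothetical examples of common choices (case, affixes, context words) and are not drawn from the systems cited above:

```python
# Hand-crafted token features of the kind conventional CRF/SVM/MEM
# sequence taggers rely on (illustrative only; not the feature sets of [1] or [5]).

def token_features(tokens, i):
    """Return a feature dict for the token at position i in a sentence."""
    tok = tokens[i]
    feats = {
        "word.lower": tok.lower(),
        "word.isupper": tok.isupper(),
        "word.istitle": tok.istitle(),
        "word.has_digit": any(ch.isdigit() for ch in tok),
        "prefix3": tok[:3],   # affixes help with morphology-rich entity names
        "suffix3": tok[-3:],
    }
    # Context features: neighboring words are a key signal for NER.
    feats["prev.word"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next.word"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats

sentence = ["IL-2", "gene", "expression"]
print(token_features(sentence, 0)["word.has_digit"])  # -> True ("IL-2" contains a digit)
```

A neural BiGRU model replaces such manually designed feature dictionaries with representations learned automatically from the input embeddings, which is precisely the advantage the deep learning approaches in this paper exploit.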
