Abstract
Named entity recognition is a challenging task that has traditionally required large amounts of knowledge in the form of feature engineering and lexicons to achieve high performance. In this paper, we present a novel neural network architecture that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering. We also propose a novel method of encoding partial lexicon matches in neural networks and compare it to existing approaches. Extensive evaluation shows that, given only tokenized text and publicly available word embeddings, our system is competitive on the CoNLL-2003 dataset and surpasses the previously reported state-of-the-art performance on the OntoNotes 5.0 dataset by 2.13 F1 points. By using two lexicons constructed from publicly available sources, we establish new state-of-the-art performance with an F1 score of 91.62 on CoNLL-2003 and 86.28 on OntoNotes, surpassing systems that employ heavy feature engineering, proprietary lexicons, and rich entity linking information.
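The lexicon encoding mentioned above can be illustrated with a short sketch. The Python function below, which is illustrative rather than the paper's exact algorithm, tags each token with a BIOES code for the longest full lexicon match covering it; the paper's contribution additionally encodes partial (prefix) matches, which this simplified version omits. The function name, the `max_len` parameter, and the toy lexicon are hypothetical.

```python
# Sketch: encoding lexicon matches as per-token BIOES features.
# Simplified: only full matches are tagged, with longest-match-first
# preference; partial-match encoding (the paper's novelty) is omitted.

def bioes_lexicon_features(tokens, lexicon, max_len=6):
    """Tag each token with a BIOES code for spans found in `lexicon`
    (a set of tuples of lowercased words)."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest span starting at position i that is in the lexicon.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(t.lower() for t in tokens[i:i + n]) in lexicon:
                match_len = n
                break
        if match_len == 0:
            i += 1                       # no match: leave "O"
        elif match_len == 1:
            tags[i] = "S"                # single-token match
            i += 1
        else:
            tags[i] = "B"                # begin of a multi-token match
            for j in range(i + 1, i + match_len - 1):
                tags[j] = "I"            # inside
            tags[i + match_len - 1] = "E"  # end
            i += match_len
    return tags

# Hypothetical usage with a tiny PERSON lexicon:
person_lexicon = {("john", "smith"), ("smith",)}
print(bioes_lexicon_features(["John", "Smith", "visited", "Paris"],
                             person_lexicon))
# -> ['B', 'E', 'O', 'O']
```

In the full system, one such feature column would be produced per lexicon category and fed to the network alongside the word representation.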
Highlights
Named entity recognition is an important task in NLP.
Collobert et al. (2011b) proposed an effective neural network model that requires little feature engineering and instead learns important features from word embeddings trained on large quantities of unlabelled text, an approach made possible by recent advances in unsupervised learning of word embeddings on massive amounts of data (Collobert and Weston, 2008; Mikolov et al., 2013) and in neural network training algorithms that permit deep architectures (Rumelhart et al., 1986).
With no external knowledge other than word embeddings, our model is competitive on the CoNLL-2003 dataset and establishes a new state of the art for OntoNotes, suggesting that, given enough data, the neural network automatically learns the relevant features for NER without feature engineering.
Summary
Named entity recognition is an important task in NLP. High-performance approaches have been dominated by applying CRF, SVM, or perceptron models to hand-crafted features (Ratinov and Roth, 2009; Passos et al., 2014; Luo et al., 2015). For sequential labelling tasks such as NER and speech recognition, a bi-directional LSTM model can take into account an effectively infinite amount of context on both sides of a word, eliminating the problem of limited context that applies to any feed-forward model (Graves et al., 2013). Convolutional neural networks (CNNs) have been investigated for modeling character-level information, among other NLP tasks. We present a hybrid model of bi-directional LSTMs and CNNs that learns both character- and word-level features, providing the first evaluation of such an architecture on well-established English-language evaluation datasets. To induce character-level features, we use a convolutional neural network, which has been successfully applied to Spanish and Portuguese NER (Santos et al., 2015) and German POS tagging (Labeau et al., 2015).
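The following is a minimal PyTorch sketch of the hybrid architecture described above, assuming a character-level CNN with max-over-time pooling whose per-word output is concatenated with word embeddings and fed to a bidirectional LSTM. Layer sizes, module names, and the plain linear output layer are illustrative; the full model also incorporates additional word-level features, such as the lexicon encoding discussed earlier, and decodes tag sequences at the sentence level.

```python
import torch
import torch.nn as nn

class BLSTM_CNN(nn.Module):
    """Illustrative hybrid BLSTM-CNN tagger: a char-level CNN feeds a
    word-level bidirectional LSTM. Hyperparameters are placeholders."""

    def __init__(self, vocab_size, char_vocab_size, n_tags,
                 word_dim=50, char_dim=25, char_filters=50, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.char_emb = nn.Embedding(char_vocab_size, char_dim)
        # 1-D convolution over the characters of each word, window size 3.
        self.char_conv = nn.Conv1d(char_dim, char_filters,
                                   kernel_size=3, padding=1)
        self.blstm = nn.LSTM(word_dim + char_filters, hidden,
                             bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, word_len)
        b, s, w = chars.shape
        c = self.char_emb(chars.view(b * s, w))        # (b*s, w, char_dim)
        c = self.char_conv(c.transpose(1, 2))          # (b*s, filters, w)
        c = torch.max(c, dim=2).values.view(b, s, -1)  # max-over-time pool
        x = torch.cat([self.word_emb(words), c], dim=-1)
        h, _ = self.blstm(x)                           # (b, s, 2*hidden)
        return self.out(h)                             # per-token tag scores

# Hypothetical usage with random inputs:
model = BLSTM_CNN(vocab_size=10000, char_vocab_size=100, n_tags=17)
words = torch.randint(0, 10000, (2, 9))    # 2 sentences, 9 tokens each
chars = torch.randint(0, 100, (2, 9, 12))  # up to 12 characters per token
scores = model(words, chars)               # (2, 9, 17) tag scores
```

Max-over-time pooling keeps the char-CNN output a fixed size regardless of word length, which is what lets it be concatenated with the word embedding at every position.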