Abstract

In this paper, we study a novel approach to named entity recognition (NER) and mention detection (MD) in natural language processing. Instead of treating NER as a sequence labeling problem, we propose a new local detection approach, which relies on the recent fixed-size ordinally forgetting encoding (FOFE) method to fully encode each sentence fragment and its left/right contexts into a fixed-size representation. Subsequently, a simple feedforward neural network (FFNN) is trained to either reject each individual text fragment or predict an entity label for it. The proposed method has been evaluated on several popular NER and MD tasks, including the CoNLL 2003 NER task and the TAC-KBP2015 and TAC-KBP2016 Tri-lingual Entity Discovery and Linking (EDL) tasks. Our method has yielded strong performance in all of these examined tasks. This local detection approach has shown many advantages over the traditional sequence labeling methods.

Highlights

  • Natural language processing (NLP), which has been extensively studied for many decades, plays an important role in artificial intelligence

  • We are interested in a fundamental problem in NLP, namely named entity recognition (NER) and mention detection (MD)

  • Unlike previous approaches that use a set of bits to indicate whether a word is in a gazetteer, they encode a match in BIOES (Begin, Inside, Outside, End, Single) annotation, which captures positional information. None of these recent successes in NER was achieved by vanilla recurrent neural networks (RNNs). They are often established by sophisticated models that combine convolutional neural networks (CNNs), LSTMs and conditional random fields (CRFs) in certain ways
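
The BIOES scheme mentioned in the last highlight can be illustrated with a minimal sketch. The function name `spans_to_bioes`, the `(start, end, label)` span format with an exclusive end index, and the `B-PER`-style tag strings are assumptions made for this example; they are not taken from the paper.

```python
from typing import List, Tuple

def spans_to_bioes(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end, label) spans (end exclusive) into BIOES tags.

    Hypothetical helper for illustration only.
    """
    tags = ["O"] * n_tokens                  # Outside by default
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"       # Single-token match
        else:
            tags[start] = f"B-{label}"       # Begin
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"       # Inside
            tags[end - 1] = f"E-{label}"     # End
    return tags

# Toy sentence of five tokens with a two-token PER match and a one-token LOC match.
print(spans_to_bioes(5, [(0, 2, "PER"), (4, 5, "LOC")]))
```

Compared with a single in-gazetteer bit per word, each tag also records where the word sits inside the match, which is the positional information the highlight refers to.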


Summary

Introduction

Natural language processing (NLP), which has been extensively studied for many decades, plays an important role in artificial intelligence. In the local detection approach, each word segment is examined individually, based on the segment itself and its left and right contexts in the sentence, to determine whether it is a valid named entity and, if so, what its label is. This approach conforms to the way humans resolve an NER problem. The left and right contexts of each word segment are encoded by the FOFE method, and a simple neural network can be trained to make a precise decision for each individual word segment based on the fixed-size representation of the contextual information. This FOFE-based local detection approach is more appealing for NER and MD than traditional sequence labeling. Our proposed method has yielded strong performance in all of the examined tasks (CoNLL 2003 NER and the TAC-KBP2015 and TAC-KBP2016 EDL tasks).
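
The encoding step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses the standard FOFE recursion z_t = alpha * z_(t-1) + e_t (with z_0 = 0 and e_t the one-hot vector of the t-th word), while the function names, the toy vocabulary, the forgetting factor of 0.5, and the simple three-part concatenation of left context, fragment and right context are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

def fofe(tokens: Sequence[str], vocab: Dict[str, int], alpha: float = 0.5) -> List[float]:
    """Fixed-size code of a variable-length sequence: z_t = alpha * z_{t-1} + e_t."""
    z = [0.0] * len(vocab)
    for tok in tokens:
        z = [alpha * v for v in z]   # decay everything seen so far
        z[vocab[tok]] += 1.0         # add the one-hot vector of the current word
    return z

def candidate_fragments(n_tokens: int, max_len: int) -> List[Tuple[int, int]]:
    """All (start, end) fragments (end exclusive) of up to max_len tokens."""
    return [(s, e)
            for s in range(n_tokens)
            for e in range(s + 1, min(s + max_len, n_tokens) + 1)]

def fragment_features(sentence: Sequence[str], start: int, end: int,
                      vocab: Dict[str, int], alpha: float = 0.5) -> List[float]:
    """Concatenate codes of left context, fragment and right context.

    Simplified layout (an assumption): the right context is read right-to-left
    so that, as on the left side, words nearest the fragment weigh the most.
    """
    left = fofe(sentence[:start], vocab, alpha)
    frag = fofe(sentence[start:end], vocab, alpha)
    right = fofe(list(reversed(sentence[end:])), vocab, alpha)
    return left + frag + right

# Toy usage: every candidate fragment gets a vector of the same size (3 * |vocab|),
# which a feedforward classifier could then accept (with a label) or reject.
vocab = {"A": 0, "B": 1, "C": 2}
sentence = ["A", "B", "C"]
for start, end in candidate_fragments(len(sentence), max_len=2):
    print((start, end), fragment_features(sentence, start, end, vocab))
```

The feature vector has the same length for every fragment regardless of sentence length, which is what allows a plain feedforward network, rather than a recurrent model, to score each fragment independently.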

Related Work
Deep Feedforward Neural Networks
Fixed-size Ordinally Forgetting Encoding
Character-level Models in NLP
FOFE-based Local Detection for NER
Word-level Features
Character-level Features
Training and Decoding Algorithm
Second-Pass Augmentation
Experiments
CoNLL 2003 NER task
KBP2015 EDL Task
KBP2016 EDL task
Data Description
Effect of various training data
The official trilingual EDL performance in KBP2016
Conclusion
