Abstract

Many name tagging approaches use local contextual information with much success, but can fail when the local context is ambiguous or limited. We present a new framework to improve name tagging by utilizing local, document-level, and corpus-level contextual information. For each word, we retrieve document-level context from other sentences within the same document and corpus-level context from sentences in other documents. We propose a model that learns to incorporate document-level and corpus-level contextual information alongside local contextual information via document-level and corpus-level attentions, which dynamically weight their respective contextual information, and gating mechanisms, which determine the influence of this information. Experiments on benchmark datasets show the effectiveness of our approach, which achieves state-of-the-art results for Dutch, German, and Spanish on the CoNLL-2002 and CoNLL-2003 datasets. We will make our code and pre-trained models publicly available for research purposes.
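
To make the attention and gating described above concrete, the sketch below shows one plausible way to combine a token's local representation with retrieved context vectors: attention weights summarize the retrieved contexts (attentive summation), and a sigmoid gate controls how much of that summary enters the final representation (gated summation). The module name, dimensions, and PyTorch framing are our assumptions for illustration and do not reproduce the authors' released code.

```python
import torch
import torch.nn as nn

class GatedContextFusion(nn.Module):
    """Illustrative sketch (not the authors' code): attentive summation
    over retrieved context vectors, followed by gated summation."""

    def __init__(self, dim):
        super().__init__()
        self.att = nn.Linear(2 * dim, 1)     # scores each retrieved context
        self.gate = nn.Linear(2 * dim, dim)  # controls context influence

    def forward(self, local, contexts):
        # local:    (batch, dim)    token representation from the query sentence
        # contexts: (batch, k, dim) representations from retrieved sentences
        k = contexts.size(1)
        expanded = local.unsqueeze(1).expand(-1, k, -1)
        scores = self.att(torch.cat([expanded, contexts], dim=-1)).squeeze(-1)
        weights = torch.softmax(scores, dim=-1)              # attention weights
        summary = (weights.unsqueeze(-1) * contexts).sum(1)  # attentive summation
        gate = torch.sigmoid(self.gate(torch.cat([local, summary], dim=-1)))
        return local + gate * summary                        # gated summation

# Toy usage with made-up dimensions: a batch of 2 tokens, 5 retrieved contexts each.
fusion = GatedContextFusion(dim=100)
out = fusion(torch.randn(2, 100), torch.randn(2, 5, 100))
print(out.shape)  # torch.Size([2, 100])
```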

Highlights

  • The task of automatically identifying and classifying named entities in text is often posed as a sentence-level sequence labeling problem where each token is labeled as being part of a name of a certain type or not (Chinchor and Robinson, 1997; Tjong Kim Sang and De Meulder, 2003); a toy example appears after this list

  • We evaluate our methods on the CoNLL-2002 and CoNLL-2003 name tagging datasets (Tjong Kim Sang and De Meulder, 2003)

  • Vanilla Name Tagging: without any additional resources or supervision, the current state-of-the-art name tagging model is the Bi-LSTM-CRF network reported by Lample et al. (2016) and Ma and Hovy (2016b), which differ in whether an LSTM or a CNN is used to encode characters
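
For readers new to the task framing in the first highlight, the toy snippet below shows what sentence-level sequence labeling looks like with BIO-style tags (the CoNLL shared tasks use a closely related IOB scheme); the example sentence is invented.

```python
# Name tagging as sentence-level sequence labeling: every token receives
# a tag marking it as the beginning (B-) or inside (I-) of a name of a
# certain type, or as outside any name (O).
tokens = ["Arsenal", "beat", "Chelsea", "in", "London", "yesterday"]
tags   = ["B-ORG",   "O",    "B-ORG",   "O",  "B-LOC",  "O"]

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```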


Summary

Introduction

The task of automatically identifying and classifying named entities in text is often posed as a sentence-level sequence labeling problem where each token is labeled as being part of a name of a certain type (e.g., location) or not (Chinchor and Robinson, 1997; Tjong Kim Sang and De Meulder, 2003). Local context alone, however, can be ambiguous or limited. To utilize contextual information beyond the query sentence, we propose a model that first produces representations for each token encoding the local context from the query sentence as well as the document-level and corpus-level contexts from the retrieved sentences. We then enhance the baseline model by adding document-level and corpus-level contextual information to the prediction process via our document-level and corpus-level attention mechanisms, respectively.
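
As a rough illustration of the retrieval step described above, the sketch below indexes a corpus by word and returns, for a query word, document-level context (other sentences in the same document) and corpus-level context (sentences from other documents). The function names and the exact-match retrieval criterion are assumptions made for this sketch; the paper's precise retrieval procedure is not reproduced here.

```python
from collections import defaultdict

def build_context_index(documents):
    """Map each word to the (doc_id, sent_id) pairs of sentences containing it.
    Illustrative sketch only, not the authors' retrieval implementation."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(documents):
        for sent_id, sentence in enumerate(doc):
            for word in set(sentence):
                index[word].append((doc_id, sent_id))
    return index

def retrieve(index, word, query_doc, query_sent):
    # Document-level context: other sentences in the same document.
    doc_ctx = [(d, s) for d, s in index[word]
               if d == query_doc and s != query_sent]
    # Corpus-level context: sentences in other documents.
    corpus_ctx = [(d, s) for d, s in index[word] if d != query_doc]
    return doc_ctx, corpus_ctx

# Usage: documents is a list of documents, each a list of tokenized sentences.
docs = [[["EU", "rejects", "German", "call"], ["German", "farmers", "protest"]],
        [["The", "German", "economy", "grew"]]]
idx = build_context_index(docs)
print(retrieve(idx, "German", query_doc=0, query_sent=0))
# -> ([(0, 1)], [(1, 0)])
```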

Baseline
Document-level Attention
Attentive Summation
Gated Summation
Tag Prediction
Dataset
Experimental Setup
Performance Comparison
Qualitative Analysis
Remaining Challenges
Related Work
Findings
Conclusions and Future Work
