Enhancing Named Entity Recognition in Twitter Messages Using Entity Linking

Ikuya Yamada, Hideaki Takeda, Yoshiyasu Takefuji

doi:10.18653/v1/w15-4320

Abstract

In this paper, we describe our approach to Named Entity Recognition in Twitter, a shared task of the ACL 2015 Workshop on Noisy User-generated Text (Baldwin et al., 2015). Because Twitter messages are noisy, short, and colloquial, the performance of Named Entity Recognition (NER) degrades significantly on them. To address this problem, we propose a novel method that enhances the performance of Twitter NER by using Entity Linking, a method for detecting entity mentions in text and resolving them to corresponding entries in knowledge bases such as Wikipedia. Our method is based on supervised machine learning and uses high-quality knowledge obtained from several open knowledge bases. In comparison with the other systems proposed for this shared task, our method achieved the best performance.

Highlights

Named Entity Recognition (NER) refers to the task of identifying mentions of entities within text.
Guo et al. (2013) showed that the main failures of Twitter Entity Linking (EL) occur when detecting entity mentions in text, because existing EL methods usually delegate mention detection to external NER software whose performance is unreliable on tweets.
The main objective of this study is to investigate whether an end-to-end EL method can enhance the performance of Twitter NER.

Summary

Introduction

Named Entity Recognition (NER) refers to the task of identifying mentions of entities (e.g., persons, locations, organizations) within text. Entity Linking (EL) refers to the task of detecting textual entity mentions and linking them to corresponding entries within knowledge bases (e.g., Wikipedia, DBpedia (Auer et al., 2007), Freebase (Bollacker et al., 2008)). With the emergence of large online knowledge bases (KBs), EL has recently gained significant attention. Guo et al. (2013) showed that the main failures of Twitter EL occur when detecting entity mentions in text, because existing EL methods usually delegate mention detection to external NER software whose performance is unreliable on tweets.
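To make the difference between the two task outputs concrete, the following is a minimal sketch, not the method described in this paper. It assumes a tiny hand-written dictionary (`TOY_KB`, hypothetical) in place of a real knowledge base and uses greedy longest-match lookup rather than supervised learning. The names `Mention`, `link_entities`, and `ner_view` are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Mention:
    start: int        # index of the first token of the mention
    end: int          # index one past the last token
    text: str         # surface form as it appears in the message
    entity_type: str  # what NER produces: a coarse type label
    kb_entry: str     # what EL adds: an entry in a knowledge base


# Hypothetical toy knowledge base: lowercased token tuple -> (type, KB entry).
TOY_KB: Dict[Tuple[str, ...], Tuple[str, str]] = {
    ("new", "york"): ("location", "New_York_City"),
    ("apple",): ("organization", "Apple_Inc."),
}


def link_entities(tokens: List[str], max_len: int = 3) -> List[Mention]:
    """Detect mentions by greedy longest-match lookup against TOY_KB."""
    mentions: List[Mention] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in TOY_KB:
                entity_type, kb_entry = TOY_KB[key]
                text = " ".join(tokens[i:i + n])
                mentions.append(Mention(i, i + n, text, entity_type, kb_entry))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return mentions


def ner_view(mentions: List[Mention]) -> List[Tuple[int, int, str]]:
    """Project EL output down to NER output: spans and types, no KB entries."""
    return [(m.start, m.end, m.entity_type) for m in mentions]
```

Calling `link_entities("apple opens store in new york".split())` yields two mentions, each carrying a knowledge base entry, and `ner_view` reduces them to typed spans. The sketch only illustrates that EL output contains the span information NER needs; it says nothing about how well either task performs on real tweets.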

Objectives

Methods

Results