NAMED ENTITY DISAMBIGUATION: A HYBRID APPROACH

Hien T Nguyen, Tru H Cao

doi:10.1080/18756891.2012.747661

Abstract

Semantic annotation of named entities for enriching unstructured content is a critical step in the development of the Semantic Web and of many Natural Language Processing applications. To this end, this paper addresses the named entity disambiguation problem, which aims at detecting entity mentions in a text and then linking them to entries in a knowledge base. We propose a hybrid method, combining heuristics and statistics, for named entity disambiguation. The novelty is that the disambiguation process is incremental: it runs in several rounds that filter the candidate referents by exploiting previously identified entities, and it extends the text with the attributes of those entities each time they are successfully resolved in a round. Experiments are conducted to evaluate the proposed method and show its advantages. The experimental results show that our approach achieves high accuracy and can be used to construct a robust entity disambiguation system.
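The incremental idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the overlap-count score, the `min_overlap` threshold, the function name `disambiguate_incrementally`, and all toy data are assumptions made here for illustration. It only shows how attributes of an entity resolved in one round can help resolve a different mention in a later round.

```python
from typing import Dict, List, Set


def disambiguate_incrementally(
    mentions: List[str],
    candidates: Dict[str, List[str]],
    entity_attrs: Dict[str, Set[str]],
    context: Set[str],
    min_overlap: int = 1,   # hypothetical threshold, not from the paper
    max_rounds: int = 5,    # hypothetical cap on the number of rounds
) -> Dict[str, str]:
    """Resolve mentions over several rounds, growing the context each time."""
    resolved: Dict[str, str] = {}
    context = set(context)  # work on a copy so the caller's set is untouched
    for _ in range(max_rounds):
        progress = False
        for mention in mentions:
            if mention in resolved:
                continue
            # Score each candidate by how many of its attributes occur in the context.
            scores = {
                entity: len(entity_attrs.get(entity, set()) & context)
                for entity in candidates.get(mention, [])
            }
            if not scores:
                continue
            best = max(scores.values())
            winners = [e for e, s in scores.items() if s == best]
            # Accept only a unique, sufficiently supported winner in this round.
            if best >= min_overlap and len(winners) == 1:
                resolved[mention] = winners[0]
                # Extend the text context with the resolved entity's attributes.
                context |= entity_attrs[winners[0]]
                progress = True
        if not progress:
            break  # nothing new was resolved, so further rounds cannot help
    return resolved


# Toy data invented for this sketch (not taken from the paper).
entity_attrs = {
    "John McCain": {"senator", "arizona", "republican"},
    "Phoenix, Arizona": {"arizona", "city"},
    "Phoenix (mythology)": {"bird", "myth"},
}
candidates = {
    "John McCain": ["John McCain"],
    "Phoenix": ["Phoenix, Arizona", "Phoenix (mythology)"],
}
result = disambiguate_incrementally(
    ["Phoenix", "John McCain"], candidates, entity_attrs, {"senator", "republican"}
)
print(result)
```

In this toy run, "Phoenix" cannot be resolved in the first round because neither candidate shares a term with the context; once "John McCain" is resolved and its attributes are added, the second round can pick the city.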

Highlights

In Information Extraction (IE) and Natural Language Processing (NLP) areas, named entities (NE) are people, organizations, locations, and others that are referred to by proper names.
For the text “About three-quarters of white, college-educated men age over 65 use the Internet, says Susannah Fox, [...] John McCain is an outlier when you compare him to his peers, Fox says.”, there are 164 entities in the Wikipedia version used with the same name “Fox”.
Because a named entity recognition module may erroneously split one name into two separate ones, we introduce the notion of partially correct mappings.

Summary

Introduction

In Information Extraction (IE) and Natural Language Processing (NLP) areas, named entities (NE) are people, organizations, locations, and others that are referred to by proper names. The name “John McCarthy” in different occurrences may refer to different NEs such as a computer scientist from Stanford University, a linguist from University of Massachusetts Amherst, an Australian ambassador, a British journalist who was kidnapped by Iranian terrorists in Lebanon in April 1986, etc. Such ambiguity makes identification of NEs more difficult and raises the NE disambiguation (NED) problem as one of the main challenges to research in the Semantic Web and in natural language processing in general. The proposed method is both rule-based and statistical. For disambiguation, it utilizes the NEs and related terms that co-occur with the target entity in a text and in Wikipedia, on the intuition that these convey, respectively, its relationships and its attributes. We use the terms name and mention interchangeably, and likewise the terms entity and referent.
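As a rough illustration of the statistical side, the sketch below ranks candidate referents for a mention by comparing the terms around the mention with terms associated with each candidate. The excerpt does not say which similarity measure or features the authors use, so the bag-of-words cosine similarity, the function names `cosine` and `rank_candidates`, and the toy term lists are all assumptions made here.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(
    context_terms: List[str],
    candidate_terms: Dict[str, List[str]],
) -> List[Tuple[str, float]]:
    """Return candidates sorted by similarity to the mention's context, best first."""
    ctx = Counter(context_terms)
    scored = [(name, cosine(ctx, Counter(terms))) for name, terms in candidate_terms.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy term lists invented for this sketch; a real system would derive them
# from the text surrounding the mention and from each candidate's Wikipedia page.
context_terms = ["stanford", "lisp", "artificial", "intelligence"]
candidate_terms = {
    "John McCarthy (computer scientist)": ["stanford", "computer", "scientist", "lisp"],
    "John McCarthy (linguist)": ["linguist", "massachusetts", "amherst", "phonology"],
}
ranked = rank_candidates(context_terms, candidate_terms)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Such a ranking would be only one ingredient of a hybrid system; the heuristic rules and the incremental rounds mentioned in the abstract are not modeled here.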

Background

Wikipedia


Journal: International Journal of Computational Intelligence Systems	Publication Date: Jan 1, 2012
Citations: 42	License type: cc-by
