A Survey of Person Name Disambiguation on the Web

Agustin D Delgado, Raquel Martinez Unanue, Soto Montalvo, Victor Fresno

doi:10.1109/access.2018.2874891

Open Access

https://doi.org/10.1109/access.2018.2874891

Abstract

Person name disambiguation on the Web (PNDW) consists of grouping the Web pages retrieved by a search engine when a person’s name is queried according to the individuals they refer to. This problem is of interest to the research community because Internet users often search for information about people on search engines, and also because people’s names are a very ambiguous type of named entity. In addition, the Web domain presents several challenges for natural language processing and information retrieval methods. In this paper, we classify PNDW systems according to their main characteristics: 1) features used to identify different individuals with the same name; 2) mathematical models used to represent the search results; 3) clustering algorithms used to group the Web pages; 4) methods used to address the impact of Web pages from social networking sites; and 5) methods used to deal with the multilingual nature of the Web. Also, we present the data sets most widely used to evaluate PNDW systems. Finally, we analyze the results obtained by the best PNDW systems in the literature.

Highlights

Person name disambiguation has attracted interest from the Natural Language Processing (NLP), Information Retrieval (IR) and Text Mining (TM) communities because people's names are a very ambiguous type of Named Entity (NE).
Evaluation metrics: The performance of PNDW systems has been measured with extrinsic evaluation metrics used in clustering problems, for two main reasons: (i) PNDW corpora have associated gold standards annotated by experts; and (ii) PNDW has been formalized as a clustering problem.
This is for two reasons: (i) these systems have been evaluated only on the Web People Search (WePS) corpora, because the University of Amsterdam (UvA) and MC4WePS corpora are more recent and most PNDW systems have not been evaluated on them; and (ii) these systems have been trained on some of the WePS data sets in order to be evaluated on the other WePS collections.
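
The highlights above say that PNDW output is scored with extrinsic clustering metrics against an expert gold standard, but they do not name a specific metric. As an illustration only, the sketch below computes B-Cubed precision, recall and F1, one commonly used extrinsic clustering measure. The function name `bcubed`, the page ids, and the assumption that each page belongs to exactly one individual (a hard clustering) are choices made for this example, not details taken from the paper.

```python
from collections import defaultdict


def bcubed(system, gold):
    """Return (precision, recall, F1) under B-Cubed for hard clusterings.

    `system` and `gold` map each page id to a single cluster label.
    Assumption for this sketch: every page belongs to exactly one cluster.
    """
    if set(system) != set(gold):
        raise ValueError("system and gold must cover the same pages")
    if not gold:
        raise ValueError("no pages to evaluate")

    # Invert the label maps: cluster label -> set of page ids.
    sys_clusters = defaultdict(set)
    gold_clusters = defaultdict(set)
    for page, label in system.items():
        sys_clusters[label].add(page)
    for page, label in gold.items():
        gold_clusters[label].add(page)

    precision = recall = 0.0
    for page in gold:
        same_sys = sys_clusters[system[page]]    # pages grouped with `page` by the system
        same_gold = gold_clusters[gold[page]]    # pages about the same individual
        correct = len(same_sys & same_gold)      # always >= 1 (the page itself)
        precision += correct / len(same_sys)
        recall += correct / len(same_gold)

    n = len(gold)
    precision /= n
    recall /= n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical search results for one ambiguous name: two real individuals, A and B.
gold = {"p1": "A", "p2": "A", "p3": "A", "p4": "B", "p5": "B"}
# A system that wrongly groups page p3 with individual B's pages.
system = {"p1": 0, "p2": 0, "p3": 1, "p4": 1, "p5": 1}

p, r, f = bcubed(system, gold)
print(f"B-Cubed precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Because the score is averaged per page, a single misplaced page lowers both precision (its system cluster is mixed) and recall (its individual is split across clusters).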

Summary

Introduction

Person name disambiguation has attracted interest from the Natural Language Processing (NLP), Information Retrieval (IR) and Text Mining (TM) communities because people's names are a very ambiguous type of Named Entity (NE). Since 2009, the Text Analysis Conferences (TAC) have organized tasks on the entity linking problem, recently renamed entity discovery and linking. The goal is to link mentions of an entity in a document to entities in a reference knowledge base, usually Wikipedia, or to detect new entities. He et al. [1] and Grütze et al. [2] have presented data sets for entity linking composed exclusively of person names. Person name disambiguation has been addressed in the news domain because people are often at the core of the events reported in the

Journal: IEEE Access	Publication Date: Jan 1, 2018
Citations: 2	License type: CC BY 3.0
