Abstract
Semantic annotation of named entities to enrich unstructured content is a critical step in the development of the Semantic Web and of many Natural Language Processing applications. To this end, this paper addresses the named entity disambiguation problem, which aims at detecting entity mentions in a text and then linking them to entries in a knowledge base. We propose a hybrid method for named entity disambiguation that combines heuristics and statistics. The novelty is that the disambiguation process is incremental: it runs in several rounds that filter the candidate referents by exploiting previously identified entities, and it extends the text with the attributes of those entities each time they are successfully resolved in a round. Experiments are conducted to evaluate the proposed method and show its advantages. The experimental results show that our approach achieves high accuracy and can be used to construct a robust entity disambiguation system.
Highlights
In the areas of Information Extraction (IE) and Natural Language Processing (NLP), named entities (NE) are people, organizations, locations, and other things that are referred to by proper names.
For the text “About three-quarters of white, college-educated men age over 65 use the Internet, says Susannah Fox, [...] John McCain is an outlier when you compare him to his peers, Fox says.”, there are 164 entities with the same name “Fox” in the Wikipedia version used.
Because a named entity recognition module may erroneously split one name into two separate names, we introduce the notion of partially correct mappings.
Summary
In the areas of Information Extraction (IE) and Natural Language Processing (NLP), named entities (NE) are people, organizations, locations, and other things that are referred to by proper names. The name “John McCarthy” in different occurrences may refer to different NEs, such as a computer scientist from Stanford University, a linguist from the University of Massachusetts Amherst, an Australian ambassador, or a British journalist who was kidnapped by Iranian terrorists in Lebanon in April 1986, among others. Such ambiguity makes the identification of NEs more difficult and raises the NE disambiguation (NED) problem as one of the main challenges for research in the Semantic Web and in natural language processing in general. The proposed method combines rule-based and statistical techniques. It exploits NEs and related terms that co-occur with the target entity in a text and in Wikipedia, on the intuition that these respectively convey the entity's relationships and attributes. We use the terms name and mention interchangeably, and likewise the terms entity and referent.
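The incremental idea described above (resolve what can be resolved, add the attributes of the resolved entities to the context, then try the remaining mentions again) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Candidate` class, the `overlap_score` function, the rule of accepting a candidate only when it strictly beats the runner-up, and the toy data are all assumptions made for illustration; the paper's actual heuristics and statistical ranking are not given in this summary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Candidate:
    """Hypothetical knowledge-base entry that a mention could refer to."""
    entity_id: str
    attributes: Set[str] = field(default_factory=set)


def overlap_score(context: Set[str], candidate: Candidate) -> int:
    """Assumed scoring: count candidate attributes present in the context."""
    return len(context & candidate.attributes)


def disambiguate(mentions: Dict[str, List[Candidate]],
                 context: Set[str],
                 max_rounds: int = 5) -> Dict[str, str]:
    """Resolve mentions over several rounds, growing the context each time."""
    context = set(context)      # copy so the caller's set is not modified
    pending = dict(mentions)    # mentions that are still ambiguous
    resolved: Dict[str, str] = {}

    for _ in range(max_rounds):
        progress = False
        for mention, candidates in list(pending.items()):
            if not candidates:  # nothing to link to; drop the mention
                del pending[mention]
                continue
            ranked = sorted(candidates,
                            key=lambda c: overlap_score(context, c),
                            reverse=True)
            best_score = overlap_score(context, ranked[0])
            runner_up = (overlap_score(context, ranked[1])
                         if len(ranked) > 1 else -1)
            # Accept only a clear winner (a sole candidate always wins).
            if best_score > runner_up:
                resolved[mention] = ranked[0].entity_id
                # Extend the context with the resolved entity's attributes.
                context |= ranked[0].attributes
                del pending[mention]
                progress = True
        if not progress:        # no mention resolved in this round; stop
            break
    return resolved


# Toy data (illustrative only, not taken from the paper).
mentions = {
    "Georgia": [
        Candidate("Georgia_(U.S._state)", {"atlanta", "usa"}),
        Candidate("Georgia_(country)", {"tbilisi", "caucasus"}),
    ],
    "Atlanta": [Candidate("Atlanta", {"georgia", "usa", "city"})],
}
result = disambiguate(mentions, {"visited"})
```

In this toy run, "Georgia" cannot be resolved in the first round because both candidates score zero against the initial context. Once "Atlanta" is resolved and its attributes are added to the context, the second round can separate the two candidates for "Georgia".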
More From: International Journal of Computational Intelligence Systems