Named Entity Extraction for Knowledgebase Enhancement

Priya Radhakrishnan

doi:10.1145/3274784.3274807

Abstract

The past decade witnessed explosive growth in the amount of unstructured data, especially in the public domain, mainly due to Web 2.0 and social media. This led to the creation of applications, called information extractors, that extract structured information from unstructured data. The extracted information is stored in a Knowledge Base (KB). A KB stores facts about entities, such as their names, types and other attributes. My PhD thesis, entitled 'Named Entity Extraction for Knowledgebase Enhancement', deals with information extraction on named entities with the purpose of enhancing a KB. The enhanced KB is in turn used by the information extraction task to refine the extraction process. Thus, the KB provides structure and guidance to the extraction task, and is itself enhanced by the results of that task. The tasks of entity extraction and KB enhancement are therefore mutually dependent and mutually beneficial. Hence, in my research I propose methods to improve both tasks, in an effort to build a strong and sound named entity extraction system. Named Entity Extraction, also known as Entity Linking (EL) in the scientific literature, is the task of determining the identity of entities mentioned in text. EL supports the automatic extraction of structured information about entities from unstructured data, and this information is stored in the KB. EL consists of two steps: Mention Detection and Entity Disambiguation. In my research, I propose methods to improve mention detection, entity disambiguation and KB enhancement.
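To make the interplay between the two EL steps and KB enhancement concrete, the following is a minimal Python sketch. It is not the method proposed in the thesis: the `Entity` record, the single-token aliases, the stopword list, the word-overlap scoring and all function names are assumptions introduced purely for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, tiny stopword list so that function words do not count as context.
STOPWORDS = {"is", "the", "of", "on", "a", "in", "through"}


@dataclass
class Entity:
    """Toy KB record: an identifier, surface aliases, and known context words."""
    entity_id: str
    aliases: set
    context: set = field(default_factory=set)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def detect_mentions(tokens, kb):
    """Mention detection: keep tokens that match an alias of some KB entity."""
    known = {alias for entity in kb for alias in entity.aliases}
    return [tok for tok in tokens if tok in known]


def disambiguate(mention, tokens, kb):
    """Entity disambiguation: pick the candidate sharing the most context words."""
    candidates = [e for e in kb if mention in e.aliases]
    context = set(tokens) - STOPWORDS - {mention}
    # On a tie, max() returns the first candidate in KB order.
    return max(candidates, key=lambda e: len(e.context & context))


def link_and_enhance(text, kb):
    """Link each mention to an entity, then add the sentence context to that entity."""
    tokens = tokenize(text)
    links = {}
    for mention in detect_mentions(tokens, kb):
        entity = disambiguate(mention, tokens, kb)
        links[mention] = entity.entity_id
        # KB enhancement: the linked entity learns the words it co-occurred with.
        entity.context |= set(tokens) - STOPWORDS - {mention}
    return links


kb = [
    Entity("Paris_Hilton", {"paris"}, {"hilton", "celebrity", "hotel"}),
    Entity("Paris_(city)", {"paris"}, {"france", "capital", "city"}),
]

first = link_and_enhance("Paris is the capital of France on the Seine", kb)
second = link_and_enhance("The Seine flows through Paris", kb)
```

In this toy run, the first sentence links "paris" to the city through the words "capital" and "france", and the KB record for the city gains the word "seine". The second sentence shares no word with the initial KB, so on its own it would fall back to the first candidate in KB order; after the enhancement step it links to the city through "seine". This is the mutual dependence described above, reduced to a few lines.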

Full Text