Abstract

Entity disambiguation is a fundamental task in natural language processing and computational linguistics. Given a query consisting of a mention (a name string) and a background document, entity disambiguation aims to link the mention to an entity in a reference knowledge base such as Wikipedia. A main challenge of this task is how to represent the meaning of the mention and the entity effectively, so that the semantic relatedness between them can be measured. Toward this goal, we introduce computational models that represent the mention and the entity in a vector space. We decompose the problem into subproblems and develop several neural network architectures, all of which are purely data‐driven and learn continuous representations of the mention and the entity from data. To train the neural network models effectively, we explore a simple yet effective method that lets us collect millions of training examples from Wikipedia without any manual annotation. Empirical results on two benchmark datasets show that our approaches based on convolutional neural networks and long short‐term memory consistently outperform top‐performing systems on both datasets.

WIREs Data Mining Knowl Discov 2017, 7:e1215. doi: 10.1002/widm.1215

This article is categorized under: Algorithmic Development > Text Mining; Algorithmic Development > Web Mining
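
To make the idea of measuring relatedness in a vector space concrete, the following is a minimal sketch, not the authors' model. It assumes the mention and each candidate entity have already been encoded as vectors, and it assumes cosine similarity as the relatedness score; the abstract does not specify the scoring function. The function names `cosine` and `rank_candidates`, the candidate entities, and the hand-written three-dimensional vectors are all hypothetical stand-ins for what a trained neural encoder would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(mention_vec, candidates):
    """Return (entity name, score) pairs sorted from most to least related."""
    scored = [(name, cosine(mention_vec, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only; a real system would learn them from data.
mention_vec = [0.9, 0.1, 0.0]  # hypothetical encoding of a mention plus its context
candidates = {
    "Michael Jordan (basketball)": [1.0, 0.0, 0.0],
    "Michael I. Jordan (scientist)": [0.0, 1.0, 0.0],
}

ranking = rank_candidates(mention_vec, candidates)
best_entity = ranking[0][0]  # the candidate whose vector is closest to the mention
```

Under these assumptions, disambiguation reduces to picking the top-ranked candidate; the quality of the result depends entirely on how well the learned representations capture the meaning of the mention and the entity.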

