Abstract
Entity linking refers to the task of aligning mentions of entities in text to their corresponding entries in a specific knowledge base, which is of great significance for many natural language processing applications such as semantic text understanding and knowledge fusion. The pivotal issue in this problem is how to make effective use of contextual information to disambiguate mentions. Moreover, it has been observed that, in most cases, a mention has a similar or even identical string to the entity it refers to. To prevent the model from linking mentions to entities with merely similar strings rather than to the semantically similar ones, in this paper we introduce the advanced language representation model BERT (Bidirectional Encoder Representations from Transformers) and design a hard negative samples mining strategy to fine-tune it accordingly. Based on the learned features, we obtain the valid entity by computing the similarity between the textual clues of a mention and the entity candidates in the knowledge base. The proposed hard negative samples mining strategy lets entity linking benefit from the larger, more expressive pre-trained representations of BERT with limited training time and computing resources. To the best of our knowledge, we are the first to equip the entity linking task with a powerful pre-trained general language model by deliberately tackling its potential shortcoming of learning literally, and experiments on standard benchmark datasets show that the proposed model yields state-of-the-art results.
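The hard negative samples mining idea described above can be sketched as follows. This is an illustrative toy, not the paper's implementation: `encode` is a character-frequency stand-in for the fine-tuned BERT encoder, and all names and candidate strings are hypothetical. The point it demonstrates is that the hardest negative for a mention is typically the candidate whose string is most similar to it, which is exactly the case the paper's strategy trains the model to handle.

```python
# Toy sketch of hard negative mining for entity linking (hypothetical names).
# encode() stands in for a BERT encoder; here it is a bag-of-characters
# embedding so the example stays self-contained and runnable.
from collections import Counter
import math

def encode(text):
    # Stand-in for BERT: character-frequency vector of the lowercased text.
    return Counter(text.lower())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mine_hard_negative(mention, gold_entity, candidates):
    """Return the non-gold candidate most similar to the mention.

    These 'hard' negatives (often string-near-duplicates of the mention)
    are the ones a literal-matching model confuses with the gold entity,
    so they are the most informative training examples."""
    m = encode(mention)
    negatives = [c for c in candidates if c != gold_entity]
    return max(negatives, key=lambda c: cosine(m, encode(c)))

candidates = ["Michael Jordan (basketball)",
              "Michael Jordan (scientist)",
              "Michael B. Jordan"]
hard = mine_hard_negative("Michael Jordan", "Michael Jordan (scientist)", candidates)
print(hard)  # → Michael B. Jordan (the most string-similar wrong candidate)
```

In the paper's setting the mined hard negatives would then be paired with the mention context to fine-tune BERT, so that semantic context rather than surface string overlap drives the similarity score.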
Highlights
The Internet has entered the era of information explosion, and the problem of information overload has brought enormous challenges to retrieval
The main contributions of this work are: (i) we propose an entity linking model based on bidirectional encoder representations from transformers (BERT), which introduces the idea of pre-trained language models into entity linking research
Parameter setting: the model in this work has two networks to train, the fine-tuning of BERT and the entity disambiguation module; we detail the parameter settings for each separately
Summary
The Internet has entered the era of information explosion, and the problem of information overload has brought enormous challenges to retrieval. Entity Linking (EL) aims at solving this problem; its task is to associate a specific textual mention of an entity in a given document with an entry in a large target catalog of entities, commonly referred to as a knowledge base (KB). Through EL, we can eliminate inconsistencies such as entity conflicts and unclear references in heterogeneous data, and a large-scale unified knowledge base can be created to help machines understand heterogeneous data from multiple sources and form high-quality knowledge. This makes EL one of the primary tasks in the Knowledge-Base Population (KBP) track at the Text Analysis Conference (TAC).