Abstract

Named Entity Disambiguation (NED) refers to the task of resolving multiple named entity mentions in an input-text sequence to their correct references in a knowledge graph. We tackle the NED problem by leveraging two novel objectives in a pre-training framework and propose a novel pre-training NED model. Specifically, the proposed pre-training NED model consists of: (i) concept-enhanced pre-training, which aims at identifying valid lexical semantic relations under concept semantic constraints derived from the external resource Probase; and (ii) a masked entity language model, which aims to train the contextualized embedding by predicting randomly masked entities based on the words and non-masked entities in the given input text. The proposed pre-training NED model therefore merges the advantage of the pre-training mechanism for generating contextualized embeddings with the strength of lexical knowledge (e.g., the concept knowledge emphasized here) for understanding language semantics. We conduct experiments on the CoNLL and TAC datasets, as well as various datasets provided by the GERBIL platform. The experimental results demonstrate that the proposed model achieves significantly higher performance than previous models.
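To make the masked entity language model objective concrete, the following is a minimal sketch of the entity-masking step it implies: entity mentions are randomly hidden so the model must recover them from the surrounding words and the remaining (non-masked) entities. The function name mask_entities, the [MASK_ENT] symbol, and the masking probability are illustrative assumptions, not details taken from the paper.

```python
import random

MASK_TOKEN = "[MASK_ENT]"  # hypothetical mask symbol for entity positions

def mask_entities(tokens, entity_spans, mask_prob=0.15):
    """Randomly mask entity mentions for a masked-entity objective.

    tokens: list of word tokens in the input text
    entity_spans: list of (start, end) token indices marking entity mentions
    Returns the masked token sequence and the entities to be predicted.
    """
    masked = list(tokens)
    targets = {}  # span -> original entity tokens the model must predict
    for start, end in entity_spans:
        if random.random() < mask_prob:
            targets[(start, end)] = tokens[start:end]
            for i in range(start, end):
                masked[i] = MASK_TOKEN
    return masked, targets

# Example: the model is trained to recover masked mentions from context.
tokens = "Michael Jordan played for the Chicago Bulls".split()
spans = [(0, 2), (5, 7)]  # "Michael Jordan", "Chicago Bulls"
masked, targets = mask_entities(tokens, spans, mask_prob=0.5)
print(masked, targets)
```

The actual model would predict entity identifiers in a knowledge graph rather than surface tokens; the sketch only illustrates the random-masking idea.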

Highlights

  • Named Entity Disambiguation (NED) is important for various Natural Language Processing (NLP) tasks such as question answering and dialog systems [1]–[3]

  • To enhance the representation ability of the pre-training mechanism, this paper introduces extra lexical knowledge, which has been proven effective for understanding semantics in many NLP tasks [12]–[14] and can be combined with distributional knowledge

  • (ii) A novel masked-entity language model (MELM) is introduced here, aiming to train the contextualized embedding model by predicting randomly masked entities based on the words and non-masked entities in the given input text


Summary

Introduction

Named Entity Disambiguation (NED) is important for various Natural Language Processing (NLP) tasks such as question answering and dialog systems [1]–[3]. Although current neural network based approaches have advanced the state-of-the-art results on the NED task [4]–[8], they fail to model complex semantic relationships, and multiple signals (i.e., words, entities, etc.) cannot fully interact in their architectures. Language model pre-training has been shown to be effective for improving many NLP tasks [9]–[11], relying on its ability to represent complex context. This study tests the effectiveness of pre-trained contextualized embeddings for the NED task. We describe a novel unsupervised pre-training model for words and entities towards NED. Conventional unsupervised pre-training models have been shown to facilitate a wide range of downstream applications, but they still encode only distributional knowledge.