Discrepancy and Uncertainty Aware Denoising Knowledge Distillation for Zero-Shot Cross-Lingual Named Entity Recognition

Ling Ge, Chunming Hu, Jihong Liu, Hong Zhang, Guanghui Ma

doi:10.1609/aaai.v38i16.29762

Abstract

Knowledge distillation-based approaches have recently yielded state-of-the-art (SOTA) results for cross-lingual NER in zero-shot scenarios. These approaches typically use a teacher network trained on the labelled source (rich-resource) language to infer soft pseudo-labels for the unlabelled target (zero-shot) language, and train a student network to approximate these pseudo-labels to achieve knowledge transfer. However, previous work has rarely addressed the pseudo-label noise caused by the source-target language gap, which can mislead the training of the student network and result in negative knowledge transfer. This paper proposes a discrepancy and uncertainty aware Denoising Knowledge Distillation model (DenKD) to tackle this issue. Specifically, DenKD uses a discrepancy-aware denoising representation learning method to optimize the class representations of the target language produced by the teacher network, thus improving the quality of the pseudo-labels and reducing noisy predictions. Further, DenKD employs an uncertainty-aware denoising method to quantify the pseudo-label noise and adjust how much the student network focuses on each sample during knowledge distillation, thereby mitigating the adverse effects of the noise. We conduct extensive experiments on 28 languages, including 4 languages not covered by the pre-trained models, and the results demonstrate the effectiveness of DenKD.

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Mar 24, 2024
Citations: 1

Similar Papers

Knowledge Transfer via Dense Cross-Layer Mutual-Distillation
Anbang Yao ... Dawei Sun
01 Jan 2020

Block change learning for knowledge distillation
Hyunguk Choi ... Moongu Jeon
Information Sciences | VOL. 513
01 Nov 2019

Deep Generative Knowledge Distillation by Likelihood Finetuning
Jingru Li ... Xiaofeng Chen
IEEE Access | VOL. 11
01 Jan 2023

A General Dynamic Knowledge Distillation Method for Visual Analytics.
Zhigang Tu ... Xuan Xiao
IEEE Transactions on Image Processing | VOL. PP
01 Jan 2021
