Domain Knowledge Transferring for Pre-trained Language Model via Calibrated Activation Boundary Distillation

Dongha Choi, Hongseok Choi, Hyunju Lee

doi:10.48448/zjcr-d904

Abstract

Since the development and wide use of pretrained language models (PLMs), several approaches have been applied to boost their performance on downstream tasks in specific domains, such as biomedical or scientific domains. Additional pre-training with in-domain texts is the most common approach for providing domain-specific knowledge to PLMs. However, these pre-training methods require considerable in-domain data and training resources and a longer training time. Moreover, the training must be re-performed whenever a new PLM emerges. In this study, we propose a domain knowledge transferring (DoKTra) framework for PLMs without additional in-domain pretraining. Specifically, we extract the domain knowledge from an existing in-domain pretrained language model and transfer it to other PLMs by applying knowledge distillation. In particular, we employ activation boundary distillation, which focuses on the activation of hidden neurons. We also apply an entropy regularization term in both teacher training and distillation to encourage the model to generate reliable output probabilities, and thus aid the distillation. By applying the proposed DoKTra framework to downstream tasks in the biomedical, clinical, and financial domains, our student models can retain a high percentage of teacher performance and even outperform the teachers in certain tasks. Our code is available at https://github.com/DMCB-GIST/DoKTra.
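The two loss ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes the squared-hinge surrogate commonly used for activation boundary distillation, and it assumes the entropy regularizer takes the form of a confidence penalty subtracted from the cross-entropy. The function names, the `margin` value, and the `beta` weight are hypothetical choices for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_regularized_loss(logits, label, beta=0.1):
    """Cross-entropy minus beta times the predictive entropy.

    Assumed form (a confidence penalty): over-confident, low-entropy
    predictions are penalized. `beta` is a hypothetical weight.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[label])
    return cross_entropy - beta * entropy(probs)


def activation_boundary_loss(teacher_pre, student_pre, margin=1.0):
    """Squared-hinge surrogate for matching activation boundaries.

    For each hidden neuron, the student's pre-activation is pushed to the
    same side of zero as the teacher's, by at least `margin`. Only the
    sign of the teacher value matters, not its magnitude.
    """
    loss = 0.0
    for t, s in zip(teacher_pre, student_pre):
        if t > 0:
            # Teacher neuron is active: student should exceed +margin.
            loss += max(0.0, margin - s) ** 2
        else:
            # Teacher neuron is inactive: student should fall below -margin.
            loss += max(0.0, margin + s) ** 2
    return loss
```

In this sketch the student incurs no activation boundary loss once every neuron sits on the teacher's side of zero with the required margin, which is why the loss focuses on whether neurons fire and not on how strongly.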
