On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency

Seo Yeon Park, Cornelia Caragea

doi:10.48448/9rg2-sq22

Abstract

A well-calibrated neural model produces confidence (probability outputs) closely approximated by the expected accuracy. While prior studies have shown that mixup training as a data augmentation technique can improve model calibration on image classification tasks, little is known about using mixup for model calibration on natural language understanding (NLU) tasks. In this paper, we explore mixup for model calibration on several NLU tasks and propose a novel mixup strategy for pre-trained language models that improves model calibration further. Our proposed mixup is guided by both the Area Under the Margin (AUM) statistic and the saliency map of each sample. Moreover, we combine our mixup strategy with model miscalibration correction techniques (i.e., label smoothing and temperature scaling) and provide detailed analyses of their impact on our proposed mixup. We focus on systematically designing experiments on three NLU tasks: natural language inference, paraphrase detection, and commonsense reasoning. Our method achieves the lowest expected calibration error compared to strong baselines on both in-domain and out-of-domain test samples while maintaining competitive accuracy.
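The expected calibration error (ECE) named above can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so the equal-width binning, the default of 10 bins, and the function name `expected_calibration_error` are assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE estimate (assumed equal-width bins over (0, 1]).

    confidences: predicted probability of the chosen class, one per sample.
    correct:     1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)

    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Samples whose confidence falls in the half-open bin (lo, hi].
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        # Gap between empirical accuracy and mean confidence in this bin.
        gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
        # Weight the gap by the fraction of samples in the bin.
        ece += in_bin.mean() * gap
    return ece
```

Under this estimator, a model whose predictions at confidence 0.8 are right 80% of the time contributes nothing to the error, while confident but wrong predictions increase it in proportion to how many samples share that confidence.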

Similar Papers

Comparative Study of Multiclass Text Classification in Research Proposals Using Pretrained Language Models
Eunchan Lee ... Sangtae Ahn
Applied Sciences | VOL. 12
29 Apr 2022

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization
Wenxuan Zhou ... Bill Yuchen Lin
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 35
18 May 2021

EW-Tune: A Framework for Privately Fine-Tuning Large Language Models with Differential Privacy
Rouzbeh Behnia ... Mohammadreza Reza Ebrahimi
01 Nov 2022

SeqGPT: An Out-of-the-Box Large Language Model for Open Domain Sequence Understanding
Tianyu Yu ... Chao Lou
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 38
24 Mar 2024
