Accelerating BERT Inference for Sequence Labeling via Early-Exit

Xiaonan Li

doi:10.48448/swej-yz43

Abstract

Both performance and efficiency are crucial factors for sequence labeling tasks in many real-world scenarios. Although pre-trained models (PTMs) have significantly improved performance on various sequence labeling tasks, they are computationally expensive. To alleviate this problem, we extend the recently successful early-exit mechanism to accelerate PTM inference for sequence labeling tasks. However, existing early-exit mechanisms are designed specifically for sequence-level tasks rather than sequence labeling. In this paper, we first propose a simple extension of sentence-level early-exit for sequence labeling tasks. To further reduce the computational cost, we also propose a token-level early-exit mechanism that allows some tokens to exit early at different layers. Considering the local dependency inherent in sequence labeling, we employ a window-based criterion to decide whether a token should exit. Token-level early-exit introduces a gap between training and inference, so we add an extra self-sampling fine-tuning stage to alleviate it. Extensive experiments on three popular sequence labeling tasks show that our approach can save up to 66% to 75% of the inference cost with minimal performance degradation. Compared with competitive compressed models such as DistilBERT, our approach achieves better performance under the same speed-up ratios of 2×, 3×, and 4×.
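
The window-based exit decision mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so everything here is an assumption made for illustration: normalized entropy as the per-token uncertainty, the maximum uncertainty inside the window as the decision statistic, and the hypothetical names `token_entropy`, `window_exit_mask`, `window`, and `threshold`.

```python
import math
from typing import List


def token_entropy(probs: List[float]) -> float:
    """Shannon entropy of one token's label distribution, scaled to [0, 1].

    Assumption: entropy is used as the uncertainty measure; the abstract
    does not specify which measure the authors use.
    """
    n = len(probs)
    if n <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def window_exit_mask(
    token_probs: List[List[float]],
    window: int = 1,
    threshold: float = 0.2,
) -> List[bool]:
    """Return True for each token that may exit at the current layer.

    A token exits only if every token within `window` positions of it
    (itself included) has uncertainty below `threshold`. This encodes the
    idea of local dependency: a confident token surrounded by uncertain
    neighbours keeps computing. Both parameters are hypothetical.
    """
    uncertainties = [token_entropy(p) for p in token_probs]
    mask = []
    for i in range(len(uncertainties)):
        lo = max(0, i - window)
        hi = min(len(uncertainties), i + window + 1)
        mask.append(max(uncertainties[lo:hi]) < threshold)
    return mask


if __name__ == "__main__":
    # Four tokens, three labels; the third token is highly uncertain.
    probs = [
        [0.98, 0.01, 0.01],
        [0.98, 0.01, 0.01],
        [0.40, 0.30, 0.30],
        [0.98, 0.01, 0.01],
    ]
    print(window_exit_mask(probs, window=1, threshold=0.2))
    print(window_exit_mask(probs, window=0, threshold=0.2))
```

With `window=0` each token is judged in isolation, so a confident token next to an uncertain one would exit; widening the window holds such tokens back, which is the intuition behind a window-based rule.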

Similar Papers

Improving sequence labeling with labeled clue sentences
Qianlong Wang ... Ruifeng Xu
Knowledge-Based Systems | VOL. 257
07 Sep 2022

A Self-Attention Based Joint Sequence Labeling Model
Eryong Wu ... Xiaoming Liu
24 Jun 2022

Switch Point biased Self-Training: Re-purposing Pretrained Models for Code-Switching
23 Oct 2021

Switch Point biased Self-Training: Re-purposing Pretrained Models for Code-Switching
Parul Chopra ... Khyathi Raghavi Chandu
01 Jan 2020
