GLM: General Language Model Pretraining with Autoregressive Blank Infilling

Yujie Qian, Jiezhong Qiu, Zhengxiao Du, Zhilin Yang, Xiao Liu, Ming Ding, Jie Tang

doi:10.48448/8m9n-nd53

Abstract

Pretraining architectures fall into several types, including autoencoding models (e.g., BERT), autoregressive models (e.g., GPT), and encoder-decoder models (e.g., T5). However, no single pretraining framework performs best across all three main task categories: natural language understanding (NLU), unconditional generation, and conditional generation. We propose a General Language Model (GLM) based on autoregressive blank infilling to address this challenge. GLM improves blank-filling pretraining by adding 2D positional encodings and allowing spans to be predicted in an arbitrary order, which results in performance gains over BERT and T5 on NLU tasks. Meanwhile, GLM can be pretrained for different types of tasks by varying the number and lengths of blanks. On a wide range of tasks across NLU, conditional generation, and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25× the parameters of BERT Large, demonstrating its generalizability to different downstream tasks.
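As a rough illustration of the blank-infilling setup with 2D positions, the sketch below builds one training input from a token list and a set of masked spans. The abstract does not give the layout, so everything here is an assumption made for illustration: the function name `build_blank_infilling_input`, the `[MASK]` and `[S]` token strings, the convention that the first position id of a span token points at its blank while the second counts within the span, and the use of a seeded shuffle to stand in for "arbitrary order". It is a minimal sketch, not the authors' implementation.

```python
import random

MASK, START = "[MASK]", "[S]"  # hypothetical special-token strings


def build_blank_infilling_input(tokens, spans, seed=0):
    """Build (inputs, pos1, pos2, order) for one example.

    tokens: list of str.
    spans:  sorted, non-overlapping half-open (start, end) index pairs
            marking the spans to blank out.
    """
    rng = random.Random(seed)

    # Corrupted text: every span is replaced by a single [MASK] token.
    corrupted, mask_position = [], {}
    cursor = 0
    for idx, (start, end) in enumerate(spans):
        corrupted.extend(tokens[cursor:start])
        mask_position[idx] = len(corrupted) + 1  # 1-based position of this blank
        corrupted.append(MASK)
        cursor = end
    corrupted.extend(tokens[cursor:])

    # Spans are appended in a shuffled (arbitrary) order.
    order = list(range(len(spans)))
    rng.shuffle(order)

    inputs = list(corrupted)
    pos1 = list(range(1, len(corrupted) + 1))  # position in the corrupted text
    pos2 = [0] * len(corrupted)                # 0 marks "not inside a span"
    for idx in order:
        start, end = spans[idx]
        span_input = [START] + tokens[start:end]
        inputs.extend(span_input)
        # First dimension: all span tokens share the position of their blank.
        pos1.extend([mask_position[idx]] * len(span_input))
        # Second dimension: running index inside the span, starting at 1.
        pos2.extend(range(1, len(span_input) + 1))
    return inputs, pos1, pos2, order


if __name__ == "__main__":
    toks = ["x1", "x2", "x3", "x4", "x5", "x6"]
    inputs, pos1, pos2, order = build_blank_infilling_input(toks, [(2, 3), (4, 6)])
    for row in zip(inputs, pos1, pos2):
        print(row)
```

Under these assumptions, varying the number and lengths of the entries in `spans` is what would shift the objective between short-blank (NLU-style) and long-blank (generation-style) pretraining, as the abstract describes.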
