Abstract
Large language models (LLMs) have shown impressive abilities on open-domain NLP tasks. However, LLMs are sometimes too unconstrained for natural language understanding (NLU) tasks, which require restricted input and output formats. Their performance on NLU tasks is highly sensitive to prompts and demonstrations, and they have been shown to perform poorly on several representative NLU tasks, such as event extraction and entity typing. To this end, we present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding. We express all NLU tasks through two atomic tasks, which define fixed instructions that restrict the input and output formats yet remain "open" to arbitrarily varied label sets. The model is first instruction-tuned with extremely fine-grained labeled data synthesized by ChatGPT and then further fine-tuned on 233 different atomic tasks from 152 datasets across various domains. The experimental results show that SeqGPT has decent classification and extraction abilities and is capable of performing language understanding tasks on unseen domains. We also conduct empirical studies on the scaling of data and model size, as well as on transfer across tasks. Our models are accessible at https://github.com/Alibaba-NLP/SeqGPT.
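To make the atomic-task formulation concrete, here is a minimal sketch of how arbitrary NLU tasks could be reduced to a classification atom and an extraction atom with fixed instruction wording but an open label set. The `build_prompt` helper, the template phrasing, and the example label sets are illustrative assumptions, not SeqGPT's actual prompt format; only the idea of two atomic tasks with fixed instructions and unrestricted labels comes from the abstract.

```python
# Sketch of the two-atomic-task formulation: the instruction text is
# fixed per atom, while the label set varies freely per downstream task.
# (Template wording is a hypothetical stand-in for SeqGPT's real prompts.)

def build_prompt(task: str, text: str, labels: list[str]) -> str:
    """Render an NLU query as one of the two atomic tasks.

    task   -- "classify" (sentence-level) or "extract" (span-level)
    labels -- an arbitrary, task-specific label set; the instruction
              stays the same regardless of which labels are supplied.
    """
    if task not in {"classify", "extract"}:
        raise ValueError(f"unknown atomic task: {task}")
    instruction = {
        "classify": "Classify the input into one of the given labels.",
        "extract": "Extract spans of the given types from the input.",
    }[task]
    return (
        f"{instruction}\n"
        f"Input: {text}\n"
        f"Labels: {', '.join(labels)}\n"
        f"Output:"
    )

# Entity typing becomes the classification atom with a custom label set:
print(build_prompt("classify", "Alibaba released SeqGPT.",
                   ["company", "product", "person"]))

# Event extraction becomes the extraction atom with event-type labels:
print(build_prompt("extract", "Alibaba released SeqGPT in 2023.",
                   ["release event", "date"]))
```

Because the instruction is fixed and only the label list changes, any new task or domain can be cast into one of these two prompts without retraining on a task-specific format, which is the sense in which the formulation stays "open".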