Abstract
Large Language Models (LLMs) demonstrate remarkable proficiency in a wide range of natural language processing (NLP) tasks. However, their extensive size, resulting from billions of parameters spread across many layers, poses significant challenges for storage, training, and inference. Traditional techniques such as model pruning and distillation reduce model size, but often at the cost of degraded performance. In this work, we propose a novel framework that dynamically skips layers on a per-sample basis to accelerate the inference speed of LLMs. First, we add an adapter at each transformer layer to predict whether the next layer should be skipped, and we propose layer-skip pretraining to recover the model's performance. Second, we optimize the model with reinforcement learning (RL) and design several strategies to stabilize training. Extensive experiments on four natural language understanding (NLU) datasets and three machine translation datasets, together with ablation studies, show that our method achieves state-of-the-art (SOTA) performance among layer-skipping methods for LLMs.
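To make the mechanism concrete, the sketch below shows one plausible realization of per-sample layer skipping in PyTorch. The abstract does not specify the adapter architecture, pooling, or gating rule, so the names and details here (`SkipAdapter`, the bottleneck width, the 0.5 threshold) are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch only: the adapter design, pooling, and threshold are
# assumptions; the paper's exact architecture is not given in the abstract.
import torch
import torch.nn as nn

class SkipAdapter(nn.Module):
    """Small bottleneck MLP that predicts whether to skip the next layer."""
    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, 1),
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Mean-pool over the sequence dimension, emit a skip probability.
        return torch.sigmoid(self.net(hidden.mean(dim=1)))  # (batch, 1)

class GatedTransformerLayer(nn.Module):
    """Wraps a transformer layer; its adapter decides, per sample,
    whether the *next* layer should be skipped."""
    def __init__(self, layer: nn.Module, d_model: int, threshold: float = 0.5):
        super().__init__()
        self.layer = layer
        self.adapter = SkipAdapter(d_model)
        self.threshold = threshold

    def forward(self, hidden: torch.Tensor, skip: torch.Tensor):
        # Skipped samples pass through as identity; the rest run the layer.
        # (A real implementation would avoid computing the layer for skipped
        # samples at all; torch.where is used here only for clarity.)
        out = torch.where(skip.view(-1, 1, 1), hidden, self.layer(hidden))
        # Predict the per-sample skip decision for the next layer.
        next_skip = self.adapter(out).squeeze(-1) > self.threshold
        return out, next_skip

# Usage sketch: thread the skip mask through a stack of gated layers.
layers = nn.ModuleList(
    GatedTransformerLayer(
        nn.TransformerEncoderLayer(512, 8, batch_first=True), d_model=512
    )
    for _ in range(4)
)
x = torch.randn(2, 16, 512)
skip = torch.zeros(2, dtype=torch.bool)  # the first layer is never skipped
for layer in layers:
    x, skip = layer(x, skip)
```

Note that the hard threshold makes the skip decision non-differentiable, which is consistent with the abstract's choice of RL (rather than end-to-end gradients) to optimize the skipping policy.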