On the Effectiveness of Parameter-Efficient Fine-Tuning

Zihao Fu,Anthony Man-Cho So,Lidong Bing,Nigel Collier,Wai Lam,Haoran Yang

doi:10.1609/aaai.v37i11.26505

Abstract

Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP tasks. However, fine-tuning the whole model is parameter inefficient as it always yields an entirely new model for each task. Currently, many research works propose to only fine-tune a small portion of the parameters while keeping most of the parameters shared across different tasks. These methods achieve surprisingly good performance and are shown to be more stable than their corresponding fully fine-tuned counterparts. However, such kind of methods is still not well understood. Some natural questions arise: How does the parameter sparsity lead to promising performance? Why is the model more stable than the fully fine-tuned models? How to choose the tunable parameters? In this paper, we first categorize the existing methods into random approaches, rule-based approaches, and projection-based approaches based on how they choose which parameters to tune. Then, we show that all of the methods are actually sparse fine-tuned models and conduct a novel theoretical analysis of them. We indicate that the sparsity is actually imposing a regularization on the original model by controlling the upper bound of the stability. Such stability leads to better generalization capability which has been empirically observed in a lot of recent research works. Despite the effectiveness of sparsity grounded by our theory, it still remains an open problem of how to choose the tunable parameters. Currently, the random and rule-based methods do not utilize task-specific data information while the projection-based approaches suffer from the projection discontinuity problem. To better choose the tunable parameters, we propose a novel Second-order Approximation Method (SAM) which approximates the original problem with an analytically solvable optimization function. The tunable parameters are determined by directly optimizing the approximation function. We conduct extensive experiments on several tasks. The experimental results show that our proposed SAM model outperforms many strong baseline models and it also verifies our theoretical analysis. The source code of this paper can be obtained from https://github.com/fuzihaofzh/AnalyzeParameterEff\/icientFinetune .

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

On the Effectiveness of Parameter-Efficient Fine-Tuning

Abstract

Talk to us

Similar Papers

More From: Proceedings of the AAAI Conference on Artificial Intelligence

Lead the way for us

Journal: Proceedings of the AAAI Conference on Artificial Intelligence	Publication Date: Jun 26, 2023
Citations: 20

Similar Papers

An Investigation of Fine-tuning Pre-trained Model for MR-to-Text Generation
Ting Hu ... Christoph Meinel
-
Ting Hu, et. al.Ting Hu ... Christoph Meinel
01 Dec 2020
01 Dec 2020

Sentiment Analysis Methods for HPV VaccinesRelated Tweets Based on Transfer Learning.
Li Zhang ... Qing Cong
Healthcare | VOL. 8
Li Zhang, et. al.Li Zhang ... Qing Cong
28 Aug 2020
Healthcare | VOL. 8

Enhancing Ductal Carcinoma Classification Using Transfer Learning with 3D U-Net Models in Breast Cancer Imaging
Saman Khalil ... Zohaib Mushtaq
Applied Sciences | VOL. 13
Saman Khalil, et. al.Saman Khalil ... Zohaib Mushtaq
27 Mar 2023
Applied Sciences | VOL. 13

Expanding Large Pre-trained Unimodal Models with Multimodal Information Injection for Image-Text Multimodal Classification
Tao Liang ... Fengmao Lv
-
Tao Liang, et. al.Tao Liang ... Fengmao Lv
01 Jun 2022
01 Jun 2022

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

On the Effectiveness of Parameter-Efficient Fine-Tuning

Abstract

Talk to us

Similar Papers

More From: Proceedings of the AAAI Conference on Artificial Intelligence