Abstract

Modern pre-training and fine-tuning techniques have significantly improved model performance on downstream tasks. However, this improvement falters when pre-trained models must adapt sequentially to multiple downstream tasks under continuously shifting training data. In this study, we aim to leverage the general capabilities of pre-trained models for knowledge sharing across tasks while endowing them with the capacity for continual learning. To this end, we propose a Hypernetwork-based Parameter-Efficient Fine-Tuning (HyperPEFT) framework. Using a pre-trained Vision Transformer (ViT) as the backbone, HyperPEFT can incorporate various PEFT techniques, enabling the pre-trained ViT to adapt to diverse downstream tasks. The core of our method is the use of hypernetworks, which efficiently encapsulate task-specific information, substantially reducing task interference and fortifying the model against catastrophic forgetting. The adoption of PEFT techniques allows precise adjustments to the pre-trained model, enhancing its performance on each specific task. Moreover, a single shared hypernetwork produces these task-specific adjustments, thereby facilitating knowledge sharing across tasks. Extensive experiments show that our method effectively mitigates catastrophic forgetting, outperforms comparison methods, and uncovers latent associations among tasks. Overall, this study introduces a unified strategy that synergistically combines the general capabilities of pre-trained models with the adaptability required for continual learning scenarios.
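To make the mechanism concrete, below is a minimal sketch of the core idea: a shared hypernetwork maps a learned per-task embedding to LoRA-style low-rank weights that adapt a frozen pre-trained linear layer. The class names, dimensions, and the choice of a LoRA-style module as the PEFT technique are illustrative assumptions, not the paper's actual architecture or API.

```python
import torch
import torch.nn as nn

class HyperPEFTAdapter(nn.Module):
    """Sketch: a shared hypernetwork that generates task-specific
    low-rank (LoRA-style) weights for a frozen ViT linear layer.
    All names and shapes here are hypothetical."""

    def __init__(self, num_tasks, task_emb_dim=64, d_model=768, rank=8):
        super().__init__()
        # One learned embedding per task; the hypernetwork weights
        # below are shared across tasks, enabling knowledge transfer.
        self.task_emb = nn.Embedding(num_tasks, task_emb_dim)
        # Shared hypernetwork: task embedding -> flattened A and B matrices.
        out_dim = 2 * rank * d_model
        self.hyper = nn.Sequential(
            nn.Linear(task_emb_dim, 256),
            nn.ReLU(),
            nn.Linear(256, out_dim),
        )
        self.d_model, self.rank = d_model, rank

    def forward(self, task_id):
        z = self.task_emb(task_id)               # (task_emb_dim,)
        flat = self.hyper(z)                     # (2 * rank * d_model,)
        a, b = flat.split(self.rank * self.d_model)
        A = a.view(self.rank, self.d_model)      # down-projection
        B = b.view(self.d_model, self.rank)      # up-projection
        return A, B

def adapted_linear(x, frozen_weight, A, B, scale=1.0):
    # Frozen pre-trained weight plus a task-conditioned low-rank update.
    return x @ frozen_weight.T + scale * (x @ A.T) @ B.T

# Usage: generate an adapter for task 2 and apply it to a frozen layer.
hyper = HyperPEFTAdapter(num_tasks=10)
frozen = torch.randn(768, 768)        # stands in for a frozen ViT weight
x = torch.randn(4, 197, 768)          # ViT token sequence (batch, tokens, dim)
A, B = hyper(torch.tensor(2))
y = adapted_linear(x, frozen, A, B)   # (4, 197, 768)
```

Because only the hypernetwork and task embeddings are trained while the backbone stays frozen, task-specific knowledge is isolated in small generated parameters, which is the property the abstract credits with reducing interference and forgetting.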
