Abstract

Recent advancements in Scene Text Visual Question Answering (Text-VQA) employ autoregressive Transformers, showing improved performance with larger models and pre-training datasets. Although various pruning frameworks exist to simplify Transformers, many are integrated into the time-consuming training process. Researchers have recently explored post-training pruning techniques, which separate pruning from training and reduce time consumption. Some methods use gradient-based importance scores that rely on labeled data, while others offer retraining-free algorithms that quickly restore pruned model accuracy. This paper proposes a novel gradient-based importance score that requires only raw, unlabeled data for post-training structured pruning of autoregressive Transformers. Additionally, we introduce a Retraining Strategy (ReSt) for efficient performance restoration of pruned models of arbitrary sizes. We evaluate our approach on the TextVQA and ST-VQA datasets using TAP, TAP††, and SaL‡-Base, all of which employ autoregressive Transformers. On TAP and TAP††, our pruning approach achieves up to a 60% reduction in size with less than a 2.4% accuracy drop, and the proposed ReSt retraining approach takes only 3 to 34 minutes, comparable to existing retraining-free techniques. On SaL‡-Base, the proposed method achieves up to a 50% parameter reduction with less than a 2.9% accuracy drop, requiring only 1.19 hours of retraining with the proposed ReSt approach. The code is publicly accessible at https://github.com/soonchangAI/LFPR.
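
The abstract does not spell out the scoring function, so the sketch below is only a rough illustration of how a gradient-based importance score might be computed from raw, unlabeled data: gradients are taken against the model's own greedy predictions used as pseudo-labels, and per-weight saliencies would then be aggregated per structured unit (e.g., attention head or FFN neuron) before pruning. This is an assumption, not necessarily the paper's formulation; the helper name `label_free_importance`, the `unlabeled_loader`, and the assumption that `model(inputs)` returns logits are all hypothetical.

```python
# Minimal sketch (not the paper's exact method): label-free, gradient-based
# importance scores for post-training structured pruning.
import torch
import torch.nn.functional as F

def label_free_importance(model, unlabeled_loader, device="cpu"):
    model.to(device).eval()
    # One score tensor per trainable parameter; in practice these would be
    # summed over each prunable structure (head, FFN channel, etc.).
    scores = {n: torch.zeros_like(p)
              for n, p in model.named_parameters() if p.requires_grad}
    for batch in unlabeled_loader:            # raw inputs only, no labels
        inputs = batch.to(device)
        logits = model(inputs)                # assumed: model returns logits
        pseudo = logits.argmax(dim=-1).detach()   # model's own predictions
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                               pseudo.view(-1))
        model.zero_grad()
        loss.backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                # First-order Taylor-style saliency: |w * dL/dw|
                scores[name] += (p.detach() * p.grad.detach()).abs()
    return scores
```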
