Online Embedding Compression for Text Classification Using Low Rank Matrix Factorization

Anish Acharya, Inderjit Dhillon, Rahul Goel, Angeliki Metallinou

doi:10.1609/aaai.v33i01.33016196

Abstract

Deep learning models have become state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed, and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and that it outperforms alternative methods such as fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain the lost accuracy without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.
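To make the compression idea concrete, the following is a minimal sketch of factorizing an embedding matrix into two smaller matrices and computing the resulting parameter savings. The abstract does not spell out the factorization procedure, so the use of truncated SVD here is an assumption, as are the function names (`factorize_embedding`, `compression_ratio`) and the toy sizes (vocabulary 1000, dimension 300, rank 20). The re-training step that recovers accuracy is not shown.

```python
import numpy as np


def factorize_embedding(E, rank):
    """Split E (vocab x dim) into A (vocab x rank) and B (rank x dim).

    Assumption: truncated SVD is used as the low rank factorization;
    A @ B is then the best rank-`rank` approximation of E in Frobenius norm.
    """
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale each kept column by its singular value
    B = Vt[:rank, :]
    return A, B


def compression_ratio(vocab, dim, rank):
    """Fraction of embedding parameters removed by storing A and B instead of E."""
    return 1.0 - rank * (vocab + dim) / (vocab * dim)


# Hypothetical toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
vocab, dim, rank = 1000, 300, 20
E = rng.standard_normal((vocab, dim)).astype(np.float32)

A, B = factorize_embedding(E, rank)

# Lookup with the factorized layer: gather rows of A, then project through B.
token_ids = np.array([3, 17, 256])
vectors = A[token_ids] @ B  # shape (3, dim)

ratio = compression_ratio(vocab, dim, rank)
print(f"parameters removed: {ratio:.1%}")
```

With these sizes the two factors hold 20 × (1000 + 300) = 26,000 values instead of 300,000, so the rank directly sets the compression ratio. In a trainable model, `A` and `B` would replace the single embedding layer and be fine-tuned on the downstream task.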

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jul 17, 2019
Citations: 33

