Abstract
Since convolutional neural networks (CNNs) require massive computing resources, many computing architectures have been proposed to improve the throughput and energy efficiency of CNN computation. However, these architectures require substantial data movement between the chip and off-chip memory, which leads to high energy consumption in the off-chip memory; thus, feature map (fmap) compression has been studied as a way to reduce this data movement. The design of fmap compression has therefore become one of the main research directions for improving the off-chip memory energy efficiency of CNN accelerators. In this brief, we propose a floating-point (FP) fmap compression scheme for a hardware accelerator, comprising both the hardware design and the compression algorithm. The scheme is compatible with quantization methods such as trained ternary quantization (TTQ), which quantizes only the weights, incurring little or no accuracy degradation while reducing computation cost. In addition to zero compression, we also compress the nonzero values in the fmap based on the FP format. The compression algorithm achieves low area overhead and a compression ratio comparable to the state-of-the-art on the ILSVRC 2012 dataset.
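The abstract does not detail the compression algorithm itself, so the following is only a minimal illustrative sketch of the two generic ideas it names: zero compression via a nonzero bitmap, and FP-format-based compression of the nonzero values (here approximated as dropping low-order FP16 mantissa bits). The function names, the FP16 format, NumPy usage, and the mantissa-truncation choice are all assumptions for illustration, not the authors' actual scheme.

```python
# Illustrative sketch only -- NOT the scheme proposed in the brief.
# Zero compression: a 1-bit-per-element bitmap of nonzero positions.
# Nonzero compression: keep sign/exponent and only the top mantissa bits of FP16 values.
import numpy as np

def compress_fmap(fmap, kept_mantissa_bits=6):
    """Compress a feature map into (bitmap, truncated nonzero codes, shape, drop)."""
    flat = fmap.astype(np.float16).ravel()
    nonzero_mask = flat != 0                       # zero-compression part
    bitmap = np.packbits(nonzero_mask)             # 1 bit per element
    raw = flat[nonzero_mask].view(np.uint16)       # FP16 bit patterns of nonzeros
    drop = 10 - kept_mantissa_bits                 # FP16 has 10 mantissa bits
    truncated = (raw >> drop).astype(np.uint16)    # lossy FP-format-based compression
    return bitmap, truncated, fmap.shape, drop

def decompress_fmap(bitmap, truncated, shape, drop):
    """Reverse the sketch above; dropped mantissa bits come back as zeros."""
    n = int(np.prod(shape))
    nonzero_mask = np.unpackbits(bitmap)[:n].astype(bool)
    flat = np.zeros(n, dtype=np.uint16)
    flat[nonzero_mask] = truncated.astype(np.uint16) << drop
    return flat.view(np.float16).reshape(shape)

if __name__ == "__main__":
    # ReLU-like fmap: roughly half the activations are exact zeros.
    fmap = np.maximum(np.random.randn(4, 8, 8), 0).astype(np.float16)
    packed = compress_fmap(fmap)
    restored = decompress_fmap(*packed)
    print("max abs error:", float(np.abs(fmap - restored).max()))
```

In this toy setting the compressed size is one bit per element for the bitmap plus a shortened code per nonzero value, so the overall ratio depends on fmap sparsity and on how many mantissa bits are kept; the brief's actual algorithm and its hardware implementation should be consulted for the real encoding.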