CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration With Better-Than-Binary Energy Efficiency

Moritz Scherer, Georg Rutishauser, Lukas Cavigelli, Luca Benini

doi:10.1109/tcad.2021.3075420

Open Access

https://doi.org/10.1109/tcad.2021.3075420

Abstract

We present a 3.1 POp/s/W fully digital hardware accelerator for ternary neural networks (TNNs). CUTIE, the completely unrolled ternary inference engine, focuses on minimizing noncomputational energy and switching activity so that dynamic power spent on storing (locally or globally) intermediate results is minimized. This is achieved by: 1) a data-path architecture completely unrolled in the feature map and filter dimensions to reduce switching activity by favoring silencing over iterative computation and maximizing data reuse; 2) targeting TNNs which, in contrast to binary NNs, allow for sparse weights that reduce switching activity; and 3) introducing an optimized training method for higher sparsity of the filter weights, resulting in a further reduction of the switching activity. Compared with state-of-the-art accelerators, CUTIE achieves greater or equal accuracy while decreasing the overall core inference energy cost by a factor of $4.8\times$–$21\times$.
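
To make the role of weight sparsity more concrete, the following is a minimal Python sketch, not the CUTIE data path or the authors' training method. It assumes a simple, hypothetical threshold rule (`ternarize`) for mapping real-valued weights to {-1, 0, +1}, and uses the count of nonzero products in a dot product (`ternary_dot`) as a rough software proxy for switching activity: a zero weight contributes nothing and can be silenced, which binary weights in {-1, +1} cannot offer. The helper `energy_per_op_femtojoules` only converts the headline efficiency figure into energy per operation, using the identity that Op/s/W equals Op/J. All function names and the threshold value are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def ternarize(weights: Sequence[float], threshold: float) -> List[int]:
    """Map real weights to {-1, 0, +1}.

    Hypothetical rule for illustration: weights with |w| <= threshold become 0.
    This is not the optimized training method described in the abstract.
    """
    ternary = []
    for w in weights:
        if w > threshold:
            ternary.append(1)
        elif w < -threshold:
            ternary.append(-1)
        else:
            ternary.append(0)
    return ternary


def ternary_dot(weights: Sequence[int], activations: Sequence[int]) -> Tuple[int, int]:
    """Return (dot product, number of nonzero products).

    Products with a zero operand are skipped; the count of remaining products
    is only a crude proxy for hardware switching activity.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    total = 0
    active = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # silenced: contributes no computation
        total += w * a
        active += 1
    return total, active


def energy_per_op_femtojoules(peta_ops_per_second_per_watt: float) -> float:
    """Convert POp/s/W to fJ per operation (1 W = 1 J/s, so Op/s/W = Op/J)."""
    ops_per_joule = peta_ops_per_second_per_watt * 1e15
    return 1e15 / ops_per_joule  # 1 J = 1e15 fJ


if __name__ == "__main__":
    w = ternarize([0.9, -0.05, -0.7, 0.02], threshold=0.1)
    result, active = ternary_dot(w, [1, -1, -1, 1])
    print(w, result, active)
    print(round(energy_per_op_femtojoules(3.1), 3), "fJ/Op")
```

Under these assumptions, half of the example weights are zero, so only two of the four products are evaluated. The conversion helper shows that 3.1 POp/s/W corresponds to roughly a third of a femtojoule per operation.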

Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication Date: Apr 1, 2022
Citations: 8
License type: other-oa
