Dynamically reconfigurable variable-precision sparse-dense matrix acceleration in Tensorflow Lite

Jose Nunez-Yanez, Andres Otero, Eduardo De La Torre

doi:10.1016/j.micpro.2023.104801

Open Access

Abstract

In this paper, we present a dynamically reconfigurable hardware accelerator called FADES (Fused Architecture for DEnse and Sparse matrices). The FADES design offers multiple configuration options that trade off parallelism and complexity, and it uses a dataflow model to create four stages that read, compute, scale, and write results. FADES is mapped to the programmable logic (PL) and integrated with the TensorFlow Lite inference engine running on the processing system (PS) of a heterogeneous SoC device. The accelerator computes the tensor operations, while dynamic reconfiguration is used to switch precision between int8 and float modes. This dynamic reconfiguration improves performance by allowing more cores to be mapped to the resource-constrained device, and it lowers power consumption compared with supporting both arithmetic precisions simultaneously. We compare the proposed hardware with a high-performance systolic architecture for dense matrices and obtain 25% better performance in dense mode with half the DSP blocks in the same technology. In sparse mode, we show that the core can outperform dense mode even at low sparsity levels, and a single core achieves up to 20x acceleration over the software-optimized NEON RUY library.
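As a rough software illustration of the four-stage read, compute, scale, and write dataflow mentioned in the abstract, the sketch below multiplies a sparse int8 weight matrix by a dense int8 activation matrix in plain Python. It is not the FADES hardware design or its TensorFlow Lite integration: every function name, the per-row (column, value) storage, and the single floating-point `scale` factor are assumptions made only for this example.

```python
from typing import List, Tuple

# Hypothetical storage: one list of (column index, weight) pairs per row.
SparseRow = List[Tuple[int, int]]


def read_stage(weights: List[List[int]]) -> List[SparseRow]:
    """Read: keep only the non-zero weights of each row."""
    return [[(c, w) for c, w in enumerate(row) if w != 0] for row in weights]


def compute_stage(rows: List[SparseRow], acts: List[List[int]]) -> List[List[int]]:
    """Compute: integer multiply-accumulate, visiting non-zero weights only."""
    n_cols = len(acts[0])
    out = []
    for row in rows:
        acc = [0] * n_cols  # wide accumulator (Python ints do not overflow)
        for c, w in row:
            for j in range(n_cols):
                acc[j] += w * acts[c][j]
        out.append(acc)
    return out


def scale_stage(acc: List[List[int]], scale: float) -> List[List[int]]:
    """Scale: requantize accumulators to the int8 range with saturation.

    Uses Python's round(), which rounds ties to even; this rounding mode
    is an assumption of the sketch.
    """
    return [[max(-128, min(127, round(v * scale))) for v in row] for row in acc]


def write_stage(result: List[List[int]], out_buffer: List[List[int]]) -> None:
    """Write: append the finished rows to the output buffer."""
    out_buffer.extend(result)


def sparse_dense_matmul_int8(weights, acts, scale):
    """Chain the four stages and return the int8 result matrix."""
    out_buffer: List[List[int]] = []
    write_stage(scale_stage(compute_stage(read_stage(weights), acts), scale), out_buffer)
    return out_buffer


if __name__ == "__main__":
    W = [[1, 0, 2], [0, 0, 0], [0, -3, 0]]  # sparse weights
    A = [[1, 2], [3, 4], [5, 6]]            # dense activations
    print(sparse_dense_matmul_int8(W, A, 1.0))
```

Because the compute stage iterates only over stored non-zero weights, the work shrinks as sparsity grows, which is the general intuition behind a sparse mode outperforming a dense one.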

Journal: Microprocessors and Microsystems
Publication Date: Feb 23, 2023
Citations: 1
License type: cc-by

Similar Papers

Fused Architecture for Dense and Sparse Matrix Processing in TensorFlow Lite
Jose Nunez-Yanez
IEEE Micro | VOL. 42
01 Nov 2022

Multicast routing in dense and sparse modes: simulation study of tradeoffs and dynamics
Liming Wei ... D Estrin
20 Sep 1995

A dynamically reconfigurable parallel pixel processing system
Daniel Llamocca ... Alonzo Vera
01 Aug 2009

Scaling up data-parallel analytics platforms: Linear algebraic operation cases
Luna Xu ... Ali R Butt
01 Dec 2017
