LRMP: Layer Replication with Mixed Precision for spatial in-memory DNN accelerators.

Abinand Nallathambi, Christin David Bose, Wilfried Haensch, Anand Raghunathan

doi:10.3389/frai.2024.1268317


Open Access

https://doi.org/10.3389/frai.2024.1268317


Journal: Frontiers in Artificial Intelligence
Publication Date: Jan 1, 2024
License type: CC BY


Abstract


In-memory computing (IMC) with non-volatile memories (NVMs) has emerged as a promising approach to address the rapidly growing computational demands of Deep Neural Networks (DNNs). Mapping DNN layers spatially onto NVM-based IMC accelerators achieves high degrees of parallelism. However, two challenges that arise in this approach are the highly non-uniform distribution of layer processing times and high area requirements. We propose LRMP, a method to jointly apply layer replication and mixed precision quantization to improve the performance of DNNs when mapped to area-constrained IMC accelerators. LRMP uses a combination of reinforcement learning and mixed integer linear programming to search the replication-quantization design space using a model that is closely informed by the target hardware architecture. Across five DNN benchmarks, LRMP achieves 2.6-9.3× latency and 8-18× throughput improvement at minimal (<1%) degradation in accuracy.
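
To make the area/throughput trade-off described in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' method: LRMP searches the design space with reinforcement learning and mixed integer linear programming, whereas this sketch uses a simple greedy heuristic. The function name `replicate_layers`, the idealized assumption that `r` replicas divide a layer's processing time by `r`, the assumption that halving precision halves a layer's tile count, and all the numbers are hypothetical and chosen only for illustration.

```python
def replicate_layers(latencies, tiles, budget):
    """Greedy toy heuristic: keep replicating the current bottleneck layer
    while the tile (area) budget allows it.

    latencies: per-layer processing time with one copy (arbitrary units)
    tiles:     per-layer tile cost of one copy (positive integers)
    budget:    total number of tiles available on the accelerator
    Returns the number of copies chosen for each layer.
    """
    if any(t <= 0 for t in tiles):
        raise ValueError("tile costs must be positive")
    n = len(latencies)
    replicas = [1] * n
    used = sum(tiles)  # one copy of every layer must fit on the chip
    if used > budget:
        raise ValueError("network does not fit even without replication")
    while True:
        # Idealized assumption: r copies cut a layer's time to latency / r.
        effective = [latencies[i] / replicas[i] for i in range(n)]
        bottleneck = max(range(n), key=lambda i: effective[i])
        if used + tiles[bottleneck] > budget:
            break  # cannot afford another copy of the slowest layer
        replicas[bottleneck] += 1
        used += tiles[bottleneck]
    return replicas


# Hypothetical 3-layer network with highly non-uniform layer times.
latencies = [8.0, 4.0, 1.0]
full_precision_tiles = [2, 4, 8]
budget = 20

# At full precision only 6 tiles are spare, so little replication is possible.
print(replicate_layers(latencies, full_precision_tiles, budget))  # [3, 1, 1]

# Assumption: lower precision halves each layer's tile cost, which frees
# area that can be spent on more replicas of the slow layers.
reduced_precision_tiles = [1, 2, 4]
print(replicate_layers(latencies, reduced_precision_tiles, budget))  # [8, 4, 1]
```

In this toy setting the slowest pipeline stage drops from 8.0 to 4.0 time units at full precision, and to 1.0 once the reduced-precision mapping frees tiles for further replication. That is the interaction between quantization and replication that LRMP optimizes jointly, there subject to an accuracy constraint that this sketch ignores.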


