Olympus: Reaching Memory-Optimality on DNN Processors

Xuyi Cai, Kaijie Tu, Ying Wang, Lei Zhang, Chengsi Gao

doi:10.1109/tc.2021.3112262

Abstract

In DNN processors, main memory access consumes much more energy than arithmetic operations. Therefore, many memory-oriented network scheduling (MONS) techniques have been introduced to exploit on-chip data reuse opportunities and reduce accesses to memory. However, deriving the theoretical lower bound of memory overhead for DNNs remains a significant challenge, and such a bound would also shed light on how to reach memory-level optimality by means of network scheduling. Prior work on MONS mainly focused on disparate optimization techniques or missed some of the data reuse opportunities in diverse network models, so its results are likely to deviate from the true memory-optimality achievable on processors. This paper introduces Olympus, which comprehensively considers the entire memory-level DNN scheduling space and formally analyzes both the true memory-optimality and how to reach memory-optimal schedules for an arbitrary DNN running on a DNN processor. The key idea behind Olympus is to derive a true memory lower bound that accounts for both intra-layer and inter-layer reuse opportunities, which prior works have not explored simultaneously. Evaluation on state-of-the-art (SOTA) DNN processors of different architectures shows that Olympus can guarantee the minimum off-chip memory access, and that it reduces DRAM access by 12.3-85.6% and saves 7.4-70.3% of energy on the latest network models.
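
The distinction between intra-layer and inter-layer reuse can be illustrated with a toy traffic count. The sketch below is not the Olympus formulation: it assumes an idealized chip with unlimited on-chip buffer, counts tensor elements rather than bytes, and uses hypothetical names (ConvLayer, layerwise_traffic, fused_traffic) introduced only for this example. It compares the compulsory off-chip traffic when each layer is scheduled on its own with the traffic when intermediate feature maps never leave the chip.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    """Hypothetical description of one convolution layer (square feature maps)."""
    in_ch: int
    out_ch: int
    in_hw: int    # input height == width
    out_hw: int   # output height == width
    kernel: int   # kernel height == width

    def ifmap(self) -> int:
        return self.in_ch * self.in_hw ** 2

    def weights(self) -> int:
        return self.in_ch * self.out_ch * self.kernel ** 2

    def ofmap(self) -> int:
        return self.out_ch * self.out_hw ** 2


def layerwise_traffic(layers: List[ConvLayer]) -> int:
    """Perfect intra-layer reuse only: every layer reads its input and
    weights once and writes its output once to off-chip memory."""
    return sum(l.ifmap() + l.weights() + l.ofmap() for l in layers)


def fused_traffic(layers: List[ConvLayer]) -> int:
    """Idealized inter-layer reuse on top of intra-layer reuse: only the
    first input, all weights, and the last output cross the chip boundary.
    Assumes intermediate feature maps always fit on chip."""
    return (layers[0].ifmap()
            + sum(l.weights() for l in layers)
            + layers[-1].ofmap())


# Two stacked 3x3 convolutions on a 32x32 image (illustrative sizes).
net = [
    ConvLayer(in_ch=3, out_ch=16, in_hw=32, out_hw=32, kernel=3),
    ConvLayer(in_ch=16, out_ch=32, in_hw=32, out_hw=32, kernel=3),
]
print(layerwise_traffic(net), fused_traffic(net))
```

In this toy case the gap between the two counts is exactly one write plus one read of the intermediate feature map. A real bound must also account for finite buffer capacity, which is what makes the scheduling problem described in the abstract hard.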

Journal: IEEE Transactions on Computers
Publication Date: Jan 1, 2021
Citations: 1
