AMAIX In-Depth: A Generic Analytical Model for Deep Learning Accelerators

Niko Zurstraßen, Tim Kogel, Holger Keding, Lukas Jünger, Rainer Leupers

doi:10.1007/s10766-022-00728-3

Abstract

In recent years, the growing popularity of Convolutional Neural Networks (CNNs) has driven the development of specialized hardware, so-called Deep Learning Accelerators (DLAs). The large market for DLAs and the large number of papers published on DLA design show that there is currently no one-size-fits-all solution. Depending on the given optimization goals, such as power consumption or performance, there may be several optimal solutions for each scenario. A commonly used method for finding these solutions as early as possible in the design cycle is the use of analytical models, which try to describe a design with simple yet insightful and sufficiently accurate formulas. The main contribution of this work is the generic Analytical Model for AI accelerators (AMAIX) for estimating CNN execution time on DLAs. It is based on the popular Roofline model. To show the validity of our approach, AMAIX was applied to the Nvidia Deep Learning Accelerator (NVDLA) as a case study, using the AlexNet and LeNet CNNs as workloads. The resulting performance predictions were verified against an RTL emulation of the NVDLA using a Synopsys ZeBu Server-based hybrid prototype. By refining the model following a divide-and-conquer paradigm, AMAIX predicted the inference time of AlexNet and LeNet on the NVDLA with an accuracy of 98%. Furthermore, this work shows how to use the obtained results for root-cause analysis and as a starting point for design space exploration.
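
The abstract states that AMAIX builds on the Roofline model. As a rough illustration of that underlying idea only (not of AMAIX itself, whose formulas are not given here), the basic Roofline bound says that a workload is limited either by compute throughput or by memory bandwidth, so its execution time is at least the larger of the two corresponding times. The sketch below assumes this textbook form; the function names, the layer dictionaries, and all numbers are hypothetical and are not taken from the paper or from the NVDLA.

```python
def roofline_time(ops, bytes_moved, peak_ops_per_s, bandwidth_bytes_per_s):
    """Lower-bound execution time (seconds) under the basic Roofline model."""
    compute_time = ops / peak_ops_per_s                # time if purely compute-bound
    memory_time = bytes_moved / bandwidth_bytes_per_s  # time if purely memory-bound
    return max(compute_time, memory_time)


def network_time(layers, peak_ops_per_s, bandwidth_bytes_per_s):
    """Sum per-layer Roofline bounds (an assumed, simplified per-layer split)."""
    return sum(
        roofline_time(layer["ops"], layer["bytes"], peak_ops_per_s, bandwidth_bytes_per_s)
        for layer in layers
    )


# Hypothetical accelerator and workload figures, for illustration only.
PEAK = 1e12       # operations per second
BANDWIDTH = 1e9   # bytes per second
layers = [
    {"ops": 2e9, "bytes": 1e6},  # compute-bound in this made-up setup
    {"ops": 1e8, "bytes": 5e6},  # memory-bound in this made-up setup
]
total_seconds = network_time(layers, PEAK, BANDWIDTH)
```

Estimating each layer separately and summing the results is one simple way to break a network into parts. Whether it matches the divide-and-conquer refinement used in the paper is not stated in the abstract.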

Journal: International Journal of Parallel Programming
Publication Date: Mar 24, 2022
License type: open-access

Similar Papers

AMAIX: A Generic Analytical Model for Deep Learning Accelerators
Lukas Jünger ... Tim Kogel
01 Jan 2020

The Implementation of LeNet-5 with NVDLA on RISC-V SoC
Shanggong Feng ... Shengang Zhou
01 Oct 2019

Deep learning accelerators: a case study with MAESTRO
Hamidreza Bolhasani ... Somayyeh Jafarali Jassbi
Journal of Big Data | VOL. 7
12 Nov 2020

An FPGA-based accelerator platform implements for convolutional neural network
Xiao Meng ... Zhiyong Qin
08 Mar 2019
