Memory access optimized routing scheme for deep networks on a mobile coprocessor

Aysegul Dundar, Vinayak Gokhale, Berin Martini, Eugenio Culurciello, Jonghoon Jin

doi:10.1109/hpec.2014.7040963

Abstract

In this paper, we present a memory access optimized routing scheme for a hardware accelerated real-time implementation of deep convolutional neural networks (DCNNs) on a mobile platform. DCNNs consist of multiple layers of 3D convolutions, each comprising between tens and hundreds of filters, and these convolutions are the most expensive operations in DCNNs. Systems that run DCNNs need to pass 3D input maps to the hardware accelerator for convolution, and they are limited by the cost of streaming data in and out of the accelerator. Such bandwidth-limited systems require data reuse to utilize computational resources efficiently. We propose a new routing scheme for 3D convolutions that takes advantage of the characteristics of DCNNs to fully utilize all the resources in the hardware accelerator. This routing scheme is implemented on the Xilinx Zynq-7000 All Programmable SoC. The system fully explores weight-level and node-level parallelization of DCNNs and achieves a peak performance 2x better than the previous routing scheme while running DCNNs.
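The abstract's point about data reuse can be illustrated with a small software sketch. This is not the authors' routing scheme or hardware design; it is a toy model, under assumed names (`layer_filter_major`, `layer_input_major`, and the `streams` counter are all hypothetical), that computes the same 3D convolution layer with two loop orders and counts how many times an input map would have to be streamed in. It assumes each filter holds one square kernel per input map and that the partial results are summed across input maps.

```python
def conv2d_valid(image, kernel):
    """2D 'valid' sliding-window product of one map with one square kernel."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def layer_filter_major(inputs, filters):
    """Process one filter at a time: every input map is re-streamed per filter."""
    streams = 0  # hypothetical counter: input-map transfers to the accelerator
    outputs = []
    for filt in filters:
        acc = None
        for image, kernel in zip(inputs, filt):
            streams += 1
            part = conv2d_valid(image, kernel)
            acc = part if acc is None else add_maps(acc, part)
        outputs.append(acc)
    return outputs, streams


def layer_input_major(inputs, filters):
    """Stream each input map once and reuse it for all filters."""
    streams = 0
    outputs = [None] * len(filters)
    for ch, image in enumerate(inputs):
        streams += 1
        for f, filt in enumerate(filters):
            part = conv2d_valid(image, filt[ch])
            outputs[f] = part if outputs[f] is None else add_maps(outputs[f], part)
    return outputs, streams


# Illustrative data: 2 input maps of 3x3, 3 filters with one 2x2 kernel per map.
inputs = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
]
filters = [
    [[[1, 0], [0, 1]], [[1, 1], [1, 1]]],
    [[[0, 1], [1, 0]], [[1, 0], [0, 0]]],
    [[[1, 1], [0, 0]], [[0, 0], [0, 1]]],
]

out_a, streams_a = layer_filter_major(inputs, filters)
out_b, streams_b = layer_input_major(inputs, filters)
print(out_a == out_b, streams_a, streams_b)
```

Both loop orders produce identical output maps, but the filter-major order streams an input map 6 times (3 filters times 2 maps) while the input-major order streams only 2. In this toy model the saving grows with the number of filters, which is why reuse matters when bandwidth is the bottleneck.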


