MALOC: A Fully Pipelined FPGA Accelerator for Convolutional Neural Networks With All Layers Mapped on Chip

Lei Gong,Xi Li,Huaping Chen,Xuehai Zhou,Chao Wang

doi:10.1109/tcad.2018.2857078

Abstract

Recently, field-programmable gate arrays (FPGAs) have been widely used in the implementations of hardware accelerator for convolutional neural networks (CNNs). However, most of these existing accelerators are designed in the same idea as their ASIC counterparts, in which all operations from different layers are mapped to the same hardware units and working in a multiplexed way. This manner does not take full advantage of reconfigurability and customizability of FPGAs, resulting in a certain degree of computational efficiency degradation. In this paper, we propose a new architecture for FPGA-based CNN accelerator that maps all the layers to their own on-chip units and working concurrently as a pipeline. A comprehensive mapping and optimizing methodology based on establishing roofline model oriented optimization model is proposed, which can achieve maximum resource utilization as well as optimal computational efficiency. Besides, to ease the programming burden, we propose a design framework which can provide a one-stop function for developers to generate the accelerator with our optimizing methodology. We evaluate our proposal by implementing different modern CNN models on Xilinx Zynq-7020 and Virtex-7 690t FPGA platforms. Experimental results show that our implementations can achieve a peak performance of 910.2 GOPS on Virtex-7 690t, and 36.36 GOP/s/W energy efficiency on Zynq-7020, which are superior to the previous approaches.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

MALOC: A Fully Pipelined FPGA Accelerator for Convolutional Neural Networks With All Layers Mapped on Chip

Abstract

Talk to us

Similar Papers

More From: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Lead the way for us

Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems	Publication Date: Nov 1, 2018
Citations: 114

Similar Papers

An Uninterrupted Processing Technique-Based High-Throughput and Energy-Efficient Hardware Accelerator for Convolutional Neural Networks
Md Najrul Islam ... Rahul Shrestha
IEEE Transactions on Very Large Scale Integration (VLSI) Systems | VOL. 30
Md Najrul Islam, et. al.Md Najrul Islam ... Rahul Shrestha
01 Dec 2022
IEEE Transactions on Very Large Scale Integration (VLSI) Systems | VOL. 30

A 7.663-TOPS 8.2-W Energy-efficient FPGA Accelerator for Binary Convolutional Neural Networks (Abstract Only)
Yixing Li ... Kai Xu
-
Yixing Li, et. al.Yixing Li ... Kai Xu
22 Feb 2017
A 7.663-TOPS 8.2-W Energy-efficient FPGA Accelerator for Binary Convolutional Neural Networks (Abstract Only)
Yixing Li ... Kai Xu

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration
Behzad Salami ... Adrian Cristal Kestelman
-
Behzad Salami, et. al.Behzad Salami ... Adrian Cristal Kestelman
01 Jun 2020
01 Jun 2020

A power-efficient and high performance FPGA accelerator for convolutional neural networks
Lei Gong ... Chao Wang
-
Lei Gong, et. al.Lei Gong ... Chao Wang
15 Oct 2017
15 Oct 2017

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

MALOC: A Fully Pipelined FPGA Accelerator for Convolutional Neural Networks With All Layers Mapped on Chip

Abstract

Talk to us

Similar Papers

More From: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems