Abstract

In recent years, Convolutional Neural Networks (CNNs) have found applications in many fields, from computer vision to speech recognition, showing outstanding results in terms of accuracy. Field Programmable Gate Arrays (FPGAs) have proved to be a promising platform for running CNN algorithms because they offer a remarkable trade-off between power consumption and computational power. However, efficiently implementing CNN models on an FPGA is a complex task, since the massively parallel processing of CNNs is often limited by FPGA storage capacity and design congestion. This article introduces MEM-OPT, a scheduling algorithm and data re-use system that aims to optimize on-chip memory usage on FPGAs, with respect to both input feature map storage and the multiply-and-accumulate process of the Processing Elements. The work presents MEM-OPT implementation results on a Xilinx XC7Z020, including hardware resources, maximum clock frequency and power consumption. MEM-OPT memory requirements are analyzed for LeNet-5, MobileNet, VGG-16 and other state-of-the-art CNNs, showing a reduction of up to 80% in the overall on-chip memory needed for storing input feature maps and accumulating output results, compared with alternative solutions available in the literature.

