Abstract

Although High-Level Synthesis (HLS) tools have been available for almost fifteen years, researchers have been reluctant to use them for accelerating their algorithms on FPGA SoCs. We present CNN-Grinder, a template-driven workflow for converting algorithmic descriptions of mobile-friendly convolutional neural networks (CNNs), such as SqueezeNet v1.1 and ZynqNet, into HLS code that can be used to program low-end, low-cost FPGA SoCs. In contrast to other works, which act as a black box from the user's perspective, CNN-Grinder does not hide its inner workings behind an automated algorithmic-to-HLS conversion; instead, it exposes every step clearly and concisely. CNN-Grinder enables developers to map a CNN onto an FPGA SoC by providing easy-to-follow steps and templates that are not constrained to specific CNN architectures or FPGA devices. Our workflow is accompanied by the SqueezeJet-2 accelerator, which accelerates the convolutional and max-pooling layers of the SqueezeNet v1.1 and ZynqNet CNNs, making it possible to achieve more than 10 fps CNN inference at 100 MHz with a batch size of 1 on a low-end, low-cost FPGA SoC such as the Xilinx XC7Z020. Finally, an analytical model of the SqueezeJet-2 accelerator is developed and evaluated against the corresponding results produced by the Xilinx Vivado HLS tool.
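
To put the throughput figure in concrete terms, the sketch below relates clock frequency, frame rate, and the per-frame cycle budget. It is plain arithmetic and not the analytical model of SqueezeJet-2 described in the paper; the function names are hypothetical, and it assumes that one inference at batch size 1 occupies the accelerator for a fixed number of clock cycles.

```python
def cycles_per_frame_budget(clock_hz: float, target_fps: float) -> float:
    """Maximum clock cycles one inference may take to sustain target_fps.

    Assumption (not from the paper): a frame is processed end to end
    before the next one starts, so fps = clock_hz / cycles_per_frame.
    """
    if clock_hz <= 0 or target_fps <= 0:
        raise ValueError("clock_hz and target_fps must be positive")
    return clock_hz / target_fps


def fps_from_cycles(clock_hz: float, cycles_per_frame: float) -> float:
    """Frame rate obtained when one inference takes cycles_per_frame cycles."""
    if clock_hz <= 0 or cycles_per_frame <= 0:
        raise ValueError("clock_hz and cycles_per_frame must be positive")
    return clock_hz / cycles_per_frame


if __name__ == "__main__":
    # 100 MHz clock and a 10 fps target, the figures quoted in the abstract.
    budget = cycles_per_frame_budget(100e6, 10.0)
    print(f"{budget:.0f} cycles per frame")
```

Under these assumptions, exceeding 10 fps at 100 MHz means the whole network must complete in fewer than ten million clock cycles per frame.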
