Application of bit-serial arithmetic units for FPGA implementation of convolutional neural networks

G Csordas, T Kovacshazy, B Feher

doi:10.1109/carpathiancc.2018.8399649

Abstract

Convolutional Neural Networks (CNNs) are commonly used in machine vision applications, for example in Cyber-Physical Systems (CPS), where real-time processing of incoming video feeds is a typical task. Execution (inference) of a CNN on the incoming stream of pictures must therefore meet strict, application-specific deadlines using the available system resources. This paper presents a bit-serial implementation of CNN inference for FPGAs. The implementation exploits the fact that the CNN is fully defined: all layers, coefficients, activation functions, etc. are known at design time. The coefficients can therefore be embedded into the bit-serial multipliers, and distributed arithmetic can be used, which significantly reduces the FPGA resources required. In addition, the implementation uses pipelining to increase performance. The current implementation of the CNN convolutional layer is generated in Python from a high-level description of the CNN: the Python code reads the description and generates Verilog code, which standard FPGA design tools can then synthesize for the target FPGA. The paper also demonstrates the implementation on a Xilinx Zynq Z-7020 FPGA.
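
To illustrate the idea of embedding fixed coefficients and using distributed arithmetic, the following is a minimal software sketch, not the authors' generator or hardware design. It assumes two's-complement inputs of a known bit width and a small kernel; the names `build_da_lut` and `da_dot` are hypothetical and chosen only for this example. Because the coefficients are known in advance, every possible sum of coefficients can be precomputed into a lookup table, and the dot product is then accumulated one input bit position at a time with shifts and adds only.

```python
def build_da_lut(coeffs):
    """Precompute the distributed-arithmetic table for fixed coefficients.

    Entry `addr` holds the sum of coeffs[k] for every bit k set in addr.
    (Hypothetical helper for illustration only.)
    """
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << len(coeffs))
    ]


def da_dot(coeffs, xs, bits):
    """Dot product of fixed `coeffs` with two's-complement inputs `xs`.

    Each step consumes one bit of every input (LSB first), mimicking a
    bit-serial datapath: no general multiplier is used, only a table
    lookup, a shift, and an add or subtract.
    """
    lut = build_da_lut(coeffs)
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Gather bit b of every input into one table address.
        addr = 0
        for k, x in enumerate(xs):
            addr |= (((x & mask) >> b) & 1) << k
        partial = lut[addr] << b
        if b == bits - 1:
            acc -= partial  # the sign bit carries weight -2^(bits-1)
        else:
            acc += partial
    return acc


if __name__ == "__main__":
    coeffs = [3, -1, 2, 5]   # assumed example kernel weights
    xs = [7, -8, -3, 4]      # 4-bit signed inputs in [-8, 7]
    print(da_dot(coeffs, xs, bits=4))
    print(sum(c * x for c, x in zip(coeffs, xs)))
```

In this sketch the table has 2^K entries for K coefficients, so it is practical only for small groups of coefficients; larger kernels would have to be split into several tables whose outputs are added. In hardware, the loop over bit positions corresponds to successive clock cycles.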

Full Text