Abstract
In this paper, we design a new hardware architecture that uses a non-blocking network to accelerate convolutional neural networks (CNNs). Unlike many other CNN accelerators, which can support only a specific type of network model, our design exploits the rearrangeability of the non-blocking network to provide both high flexibility and high parallelism. We implemented the accelerator on a Xilinx Virtex UltraScale+ FPGA VCU128 Evaluation Kit and evaluated it by running the LeNet-5 CNN model.