Abstract

Although the convolutional neural network (CNN) has exhibited outstanding performance in various applications, the deployment of CNN on embedded and mobile devices is limited by the massive computations and memory footprint. To address these challenges, Courbariaux and co-workers put forward binarized neural network (BNN) which quantizes both the weights and activations to [Formula: see text]1. From the perspective of hardware, BNN can greatly simplify the computation and reduce the storage. In this work, we first present the algorithm optimizations to further binarize the first layer and the padding bits of BNN; then we propose a fully binarized CNN accelerator. With the Shuffle–Compute structure and the memory-aware computation schedule scheme, the proposed design can boost the performance for feature maps of different sizes and make full use of the memory bandwidth. To evaluate our design, we implement the accelerator on the Zynq ZC702 board, and the experiments on the SVHN and CIFAR-10 datasets show the state-of-the-art performance efficiency and resource efficiency.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call