Abstract
This brief presents a novel architecture for a resource-efficient inference accelerator for binary convolutional neural networks (BCNNs). The proposed architecture processes each constituent block of a network in an output-oriented manner. It skips the redundant operations on the elements within a pooling window once the pooling result is determined, as well as the operations on padded zeros. A BCNN inference accelerator based on the proposed architecture has been implemented on an FPGA. Its resource efficiency reaches 41.45M-OP/s/LUT on the CIFAR-10 classification task. The functionality of the proposed accelerator has been verified by implementing a fully integrated BCNN inference system including an MCU.
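To illustrate the early-termination idea behind the pooling-window skipping described above, the following is a minimal software sketch (not the accelerator's actual hardware logic): in a BCNN, max pooling over binary activations can stop as soon as any element in the window evaluates to 1, since the pooling result is then already determined.

```python
def binary_max_pool(window):
    """Early-exit max pooling over a window of binary activations.

    In a BCNN, activations are 0/1, so the max over a pooling window
    is 1 as soon as any element is 1. Returning at that point skips
    the computation of the remaining elements in the window, which is
    the redundancy the proposed architecture avoids in hardware.
    """
    for value in window:
        if value == 1:
            return 1  # result determined; remaining elements are skipped
    return 0  # every element was 0


# Hypothetical 2x2 pooling windows for illustration:
print(binary_max_pool([0, 1, 0, 0]))  # early exit after the second element
print(binary_max_pool([0, 0, 0, 0]))  # full window must be examined
```

The same principle lets the accelerator skip the convolutions that would have produced the remaining window elements, rather than merely the comparisons shown here.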