Abstract

This brief presents a novel architecture for a resource-efficient inference accelerator for binary convolutional neural networks (BCNNs). The proposed architecture consistently processes each constituent block of a network in an output-oriented manner. It skips the redundant operations on the remaining elements in a pooling window once the pooling result has been determined, as well as the operations associated with the padded zeros. A BCNN inference accelerator based on the proposed architecture has been implemented on an FPGA. The resource efficiency reaches 41.45M-OP/s/LUT on the CIFAR-10 classification task. The functionality of the proposed accelerator has been verified by implementing a fully integrated BCNN inference system including an MCU.
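To illustrate the two skipping ideas mentioned above, the following is a minimal software sketch, not the paper's implementation. It assumes a conv-binarize-maxpool block ordering in which binary activations take values in {+1, -1}, so a max-pooled output is decided as soon as any element in the window binarizes to +1, and it assumes the batch-norm/bias comparison is folded into a single `threshold`. All names, shapes, and the thresholding convention are illustrative assumptions.

```python
import numpy as np

def pooled_output(act, weights, threshold, oy, ox, pool=2, k=3, pad=1):
    """Compute one max-pooled binary output in an output-oriented manner.

    act:       binarized input feature map, shape (C, H, W), values in {+1, -1}
    weights:   binarized kernel, shape (C, k, k), values in {+1, -1}
    threshold: comparison value folding batch norm and bias (assumed convention)
    (oy, ox):  coordinates of the pooled output element
    """
    C, H, W = act.shape
    for py in range(pool):                  # iterate over the pooling window
        for px in range(pool):
            cy = oy * pool + py             # pre-pool convolution position
            cx = ox * pool + px
            acc = 0
            for ky in range(k):
                for kx in range(k):
                    iy, ix = cy + ky - pad, cx + kx - pad
                    if iy < 0 or iy >= H or ix < 0 or ix >= W:
                        continue            # skip operations on padded zeros
                    acc += int(np.dot(act[:, iy, ix], weights[:, ky, kx]))
            if acc >= threshold:            # binarized activation would be +1
                return 1                    # pooling result determined: skip the
                                            # remaining elements of the window
    return -1                               # no element exceeded the threshold
```

In this sketch, the early return corresponds to skipping the redundant pooling-window computations, and the boundary check corresponds to skipping operations on padded zeros; in the actual hardware these would be scheduling decisions rather than Python control flow.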
