Abstract

Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in a variety of computer vision tasks. However, CNN algorithms are computationally and power intensive, which makes them difficult to run on wearable and embedded systems. One way to address this constraint is to reduce the number of computational operations performed. Recently, several approaches have addressed the problem of computational complexity in CNNs. Most of these methods, however, require dedicated hardware. We propose a new method for computation reduction in CNNs that substitutes Multiply and Accumulate (MAC) operations with a codebook lookup and can be executed on generic hardware. The proposed method, called QL-Net, combines several concepts: (i) codebook construction, (ii) a layer-wise retraining strategy, and (iii) substitution of the MAC operations with lookup of the convolution responses at inference time. QL-Net achieves 98.6% accuracy on the MNIST dataset with a 5.8x reduction in runtime, compared to a MAC-based CNN model that achieves 99.2% accuracy.
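The core idea of replacing MAC operations with a codebook lookup can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: all sizes, the k-means-style codebook construction, and the 1-D "patch" abstraction are assumptions for illustration. Input patches are quantized to a small codebook, each codeword's convolution responses are precomputed, and inference then reads the response from a table instead of performing per-filter multiply-accumulates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes (not from the paper).
K = 16          # codebook size
patch_dim = 9   # e.g. a flattened 3x3 receptive field
n_filters = 4

# (i) Codebook construction: a simple Lloyd/k-means codebook over sample
# input patches (the paper's exact construction may differ).
patches = rng.standard_normal((1000, patch_dim))
codebook = patches[rng.choice(len(patches), K, replace=False)].copy()
for _ in range(10):
    dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(1)
    for k in range(K):
        members = patches[assign == k]
        if len(members):
            codebook[k] = members.mean(0)

# (iii) Precompute the convolution response of every codeword with every
# filter, so inference needs no MAC over the patch dimension.
filters = rng.standard_normal((n_filters, patch_dim))
lookup = codebook @ filters.T   # shape (K, n_filters)

def conv_via_lookup(patch):
    """Approximate the filter responses by nearest-codeword table lookup."""
    idx = ((codebook - patch) ** 2).sum(1).argmin()
    return lookup[idx]

patch = patches[0]
exact = filters @ patch          # conventional MAC result
approx = conv_via_lookup(patch)  # codebook-lookup approximation
```

Note that the nearest-codeword search itself still costs distance computations; the savings come from amortizing one lookup across all filters, and in practice the codebook is kept small and the search optimized.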
