Abstract

Convolutional Neural Networks (CNNs) have revolutionized computer vision and achieved state-of-the-art performance in image processing, object recognition, and video classification. CNN inference is notoriously compute-intensive, with convolutions accounting for more than 90% of the total operations, and the ability to trade off accuracy, performance, power, and latency to meet target application requirements keeps it an open research topic. This paper proposes the Spatial Locality Input Data (SLID) method for computational reuse during the inference stage of a pre-trained network. The method exploits the spatial locality of input data by skipping part of the multiply-and-accumulate (MAC) operations for adjacent data and equating their values to previously computed ones. SLID improves the throughput of resource-constrained devices (Internet-of-Things, edge devices) and accelerates inference by reducing the number of MAC operations. This approximate computing scheme requires neither a similarity quantification step nor any modification to the training stage. Computational data reuse was evaluated with alternating layer selections on three well-known, distinct CNN structures and data sets: LeNet, CIFAR-10, and AlexNet. The method saves up to 34.9%, 49.84%, and 31.5% of MAC operations while reducing accuracy by 8%, 3.7%, and 5.0% for the three models, respectively. In addition, the proposed method saves memory accesses by eliminating the fetching of skipped inputs. Furthermore, the effects of filter size, stride, and padding on accuracy and operation savings are analyzed. SLID is the first work to exploit input spatial locality to save CNN convolution operations with minimal accuracy loss and without memory or computational overhead, making it a strong option for supporting intelligence at the edge.
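
A minimal sketch of the reuse idea follows, assuming a single-channel, stride-1, "valid" 2-D convolution; the conv2d_slid name and the simple even/odd-column skip policy are illustrative assumptions, not necessarily the paper's exact selection rule.

    import numpy as np

    def conv2d_slid(x, w):
        """2-D convolution that computes MACs only at even output columns and
        copies each odd column from its left neighbour (spatial-locality reuse)."""
        kh, kw = w.shape
        oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
        y = np.empty((oh, ow))
        macs = 0
        for i in range(oh):
            for j in range(ow):
                if j % 2:  # skipped position: no input fetch, no MAC operations
                    y[i, j] = y[i, j - 1]
                else:      # computed position: kh * kw MAC operations
                    y[i, j] = float(np.sum(x[i:i + kh, j:j + kw] * w))
                    macs += kh * kw
        return y, macs

An exact convolution would perform oh * ow * kh * kw MACs; this sketch performs roughly half of them while also skipping the input fetches for the reused positions.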

Highlights

  • Contemporary computing solutions are evolving toward artificial intelligence (AI) and massive data analysis algorithms

  • The computational reuse was tested on selected layers alone and on multiple layers collectively to examine the trade-off between MAC operation savings and accuracy loss

  • After running the pretrained models using a MATLAB wrapper, the output of each convolutional layer, which acts as the input to the next one, was modified according to the proposed scheme and re-injected into the network's forward pass (see the sketch after this list)
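
A framework-agnostic sketch of this evaluation flow is given below. The paper used a MATLAB wrapper, so the forward_with_reuse() and apply_slid() helpers here are hypothetical stand-ins, and the column-copy policy inside apply_slid() is an illustrative assumption.

    import numpy as np

    def apply_slid(fmap):
        """Overwrite every other column of an (H, W, C) feature map with its
        left neighbour, so the next layer consumes the reused values."""
        out = fmap.copy()
        w = out.shape[1]
        out[:, 1::2, :] = out[:, 0:w - 1:2, :]
        return out

    def forward_with_reuse(layers, x, selected):
        """Run a pretrained network layer by layer; after each selected
        convolutional layer, modify its output and re-inject it downstream."""
        for idx, layer in enumerate(layers):
            x = layer(x)
            if idx in selected:
                x = apply_slid(x)
        return x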


Summary

INTRODUCTION

Contemporary computing solutions are evolving toward artificial intelligence (AI) and massive data analysis algorithms. While deeper convolutional layers provide better performance, aggressively reducing the dimensions leads to information loss and degrades the system's performance. Compression techniques such as pruning and quantization may require re-training to fine-tune the weights and recover accuracy after pruning. The approach here is to selectively substitute previous computation results for forthcoming ones to decrease the CNN's computational intensity without the need for additional processing. The method supports both traditional computing platforms and emerging ones, such as the RRAM crossbar array.
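
As a back-of-the-envelope illustration of the scale of computation involved, the snippet below counts the MACs of one standard convolutional layer and applies an assumed 50% skip ratio; the conv_macs helper, the LeNet-like layer shape, and the skip ratio are illustrative assumptions chosen only to match the magnitude of the 31-50% savings reported in the abstract.

    def conv_macs(k, c_in, c_out, h_out, w_out):
        # A standard conv layer performs one k*k*c_in dot product per output element.
        return k * k * c_in * c_out * h_out * w_out

    baseline = conv_macs(k=5, c_in=6, c_out=16, h_out=10, w_out=10)  # LeNet-like layer
    reused = baseline // 2  # assumed: every other output position reuses its neighbour
    print(f"{baseline:,} MACs -> {reused:,} MACs "
          f"({100 * (baseline - reused) / baseline:.0f}% saved)")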

RELATED WORK ON EFFICIENT CNN IMPLEMENTATIONS
SIMULATION RESULTS OF SLID
CONCLUSION AND FUTURE WORK