Abstract

Three-dimensional convolutional neural networks (3D CNNs) have been shown to perform well in classification tasks such as human action recognition and hyperspectral image classification, among other applications. However, 3D CNNs often contain convolution kernels of different sizes and require a large number of operations, which makes inference slow on resource-constrained terminals. To address these two problems, this paper proposes a 3D CNN accelerator design based on loop optimization and weight reuse. The design applies loop tiling, loop unrolling, and loop fusion to optimize the direct-convolution form of the 3D convolution operation. Guided by this optimized form, a module that can accelerate different 3D convolution operations is designed, together with a memory access method based on weight reuse that reduces the memory access time for features and weights. Experimental results show that the proposed accelerator not only supports convolution kernels of different sizes but also achieves a performance density of 1.59, which outperforms most existing accelerators.
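
As a rough software illustration of the loop transformations named above, the following Python sketch compares a direct 3D convolution loop nest with a tiled variant in which each weight tile is fetched once and reused across all output positions. It is not the accelerator design described in the paper. The function names, the tile sizes `Tm` and `Tc`, the tensor layout, and the stride-1, no-padding setting are all assumptions made for illustration, and the vectorized NumPy update only stands in for what unrolled hardware loops would compute in parallel.

```python
import numpy as np

def conv3d_direct(x, w):
    """Direct 3D convolution (stride 1, no padding), written as a plain loop nest.
    x: (C, D, H, W) input features; w: (M, C, Kd, Kh, Kw) weights."""
    C, D, H, W = x.shape
    M, _, Kd, Kh, Kw = w.shape
    Do, Ho, Wo = D - Kd + 1, H - Kh + 1, W - Kw + 1
    out = np.zeros((M, Do, Ho, Wo), dtype=np.float64)
    for m in range(M):
        for d in range(Do):
            for h in range(Ho):
                for q in range(Wo):
                    acc = 0.0
                    for c in range(C):
                        for kd in range(Kd):
                            for kh in range(Kh):
                                for kw in range(Kw):
                                    acc += x[c, d + kd, h + kh, q + kw] * w[m, c, kd, kh, kw]
                    out[m, d, h, q] = acc
    return out

def conv3d_tiled(x, w, Tm=2, Tc=2):
    """Same result, with output and input channels tiled (hypothetical tile
    sizes Tm, Tc). Each weight tile is read once per kernel position and
    reused for every output position."""
    C, D, H, W = x.shape
    M, _, Kd, Kh, Kw = w.shape
    Do, Ho, Wo = D - Kd + 1, H - Kh + 1, W - Kw + 1
    out = np.zeros((M, Do, Ho, Wo), dtype=np.float64)
    for m0 in range(0, M, Tm):              # tile over output channels
        for c0 in range(0, C, Tc):          # tile over input channels
            for kd in range(Kd):
                for kh in range(Kh):
                    for kw in range(Kw):
                        # Weight tile for this kernel position, shape (<=Tm, <=Tc).
                        w_tile = w[m0:m0 + Tm, c0:c0 + Tc, kd, kh, kw]
                        # Shifted input window covering all output positions.
                        x_win = x[c0:c0 + Tc, kd:kd + Do, kh:kh + Ho, kw:kw + Wo]
                        # Contract over the input-channel axis; this plays the
                        # role of the unrolled multiply-accumulate loops.
                        out[m0:m0 + Tm] += np.tensordot(w_tile, x_win, axes=([1], [0]))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 5, 6, 7))
# Two kernel shapes, to show the same loop nest handling different sizes.
w_small = rng.standard_normal((4, 3, 1, 1, 1))
w_large = rng.standard_normal((4, 3, 2, 3, 3))
match_small = np.allclose(conv3d_direct(x, w_small), conv3d_tiled(x, w_small))
match_large = np.allclose(conv3d_direct(x, w_large), conv3d_tiled(x, w_large))
```

In this sketch the kernel-position loops sit outside the spatial loops, so one small weight tile serves an entire output volume before the next tile is loaded. That is one simple way to picture weight reuse; the paper's actual memory access method may differ.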
