Abstract
Convolutional Neural Networks (CNNs) have emerged as the most effective technique for solving a host of machine learning tasks, especially in the image and video processing domains. However, deploying CNNs on computing systems with smaller form factors has proven extremely challenging due to the computational complexity of CNNs. Hardware acceleration of CNNs using FPGAs has emerged as a promising approach due to the high performance, energy efficiency, and reconfigurability of FPGAs. Winograd filtering-based convolution is the most efficient algorithm for computing convolutions with small filter sizes. In this paper, we propose a unified architecture named UniWiG, in which both Winograd-based convolution and general matrix multiplication (GEMM) can be accelerated using the same set of processing elements. This enables efficient utilization of FPGA hardware resources for accelerating all the layers in a CNN. The proposed architecture has been used to accelerate the AlexNet CNN, showing performance improvements in the range of 1.4x to 4.02x over a state-of-the-art GEMM accelerator, with only 13% additional FPGA resources. We have also analyzed performance with varying Winograd tile sizes and identified the most appropriate tile sizes for maximizing performance while reducing on-chip memory usage.
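To illustrate why Winograd filtering is attractive for small filters, the following minimal sketch (not the paper's FPGA implementation) shows the 1-D Winograd transform F(2,3): two outputs of a 3-tap filter are computed with 4 multiplications instead of the 6 required by direct convolution. The function name `winograd_f23` and the test values are illustrative only.

```python
import numpy as np

def winograd_f23(d, g):
    """Winograd F(2,3): 4 input samples d, 3 filter taps g -> 2 outputs.

    Uses 4 multiplications (m1..m4) instead of the 6 needed directly.
    In 2-D CNN layers the same idea applies per tile, which is where the
    multiplication savings over direct convolution come from.
    """
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

# Check against direct 3-tap convolution on sample data.
d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -1.0, 2.0])
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)
```

The filter-side transform terms (those involving only g) can be precomputed once per filter, so at inference time each tile costs only the 4 element-wise multiplications; the larger tile sizes analyzed in the paper trade further multiplication savings against on-chip memory for the transforms.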