Abstract

We present <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">FastConv</i> , a template-based code auto-generation open-source library that can automatically generate high-performance deep learning convolution kernels of arbitrary matrices/tensors shapes. FastConv is based on the Winograd algorithm, which is reportedly the highest performing algorithm for the time-consuming layers of convolutional neural networks. ARM CPUs cover a wide range of designs and specifications, from embedded devices to HPC-grade CPUs. The leads to the dilemma of how to consistently optimize Winograd-based convolution solvers for convolution layers of different shapes. FastConv addresses this problem by using templates to auto-generate multiple shapes of tuned kernels variants suitable for skinny tall matrices. As a performance portable library, FastConv transparently searches for the best combination of kernel shapes, cache tiles, scheduling of loop orders, packing strategies, access patterns, and online/offline computations. Auto-tuning is used to search the parameter configuration space for the best performance for a given target architecture and problem size. Results show 1.02x to 1.40x, 1.14x to 2.17x, and 1.22x and 2.48x speedup is achieved over NNPACK, ARM NN, and FeatherCNN on Kunpeng 920. Furthermore, performance portability experiments with various convolution shapes show that FastConv achieves 1.2x to 1.7x speedup and 2x to 22x speedup over NNPACK and ARM NN inference engine using Winograd on Kunpeng 920. CPU performance portability evaluation on VGG–16 show an average speedup over NNPACK of 1.42x, 1.21x, 1.26x, 1.37x, 2.26x, and 11.02x on Kunpeng 920, Snapdragon 835, 855, 888, Apple M1, and AWS Graviton2, respectively.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.