Abstract

Popular deep neural networks (DNNs) spend the majority of their execution time computing convolutions. The Winograd family of algorithms can greatly reduce the number of arithmetic operations required and is used in many DNN software frameworks. However, the performance gain comes at the expense of reduced floating point (FP) numerical accuracy. In this article, we analyse the worst-case FP error and derive estimates of the norm and conditioning of the algorithm. We show that the error bound grows exponentially with the size of the convolution. Further, the error bound of the modified Winograd algorithm is slightly lower but still exponential. We propose several methods for reducing FP error. We propose a canonical evaluation ordering based on Huffman coding that reduces summation error. We study the selection of sampling "points" experimentally and find empirically good points for the most important sizes. We identify the main factors associated with good points. In addition, we explore other methods to reduce FP error, including mixed-precision convolution and pairwise summation across DNN channels. Using our methods, we can significantly reduce FP error for a given block size, which allows larger block sizes to be used and, in turn, reduces computation.
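To make the trade-off concrete, below is a minimal Python/NumPy sketch of the standard 1D Winograd F(2,3) kernel using the commonly published transform matrices; it is not the authors' implementation and does not include their point-selection, summation-ordering, or mixed-precision techniques. The variable names (d for the input tile, g for the filter) and the float64 direct correlation used as a reference are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' code): 1D Winograd F(2,3),
# which produces 2 outputs of a 3-tap correlation from a 4-element input
# tile using 4 elementwise multiplications instead of 6.
import numpy as np

# Standard F(2,3) transform matrices.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]], dtype=np.float32)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f23(d, g):
    """Winograd F(2,3): transform, elementwise multiply, inverse transform."""
    U = G @ g      # transformed filter, shape (4,)
    V = BT @ d     # transformed input tile, shape (4,)
    M = U * V      # the 4 Winograd multiplications
    return AT @ M  # back-transform to the 2 outputs

def direct_f23(d, g):
    """Reference: direct (un-flipped) correlation, 6 multiplications."""
    return np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                     d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])

rng = np.random.default_rng(0)
d = rng.standard_normal(4).astype(np.float32)
g = rng.standard_normal(3).astype(np.float32)

y_wino  = winograd_f23(d, g)                                   # fp32 Winograd
y_ref64 = direct_f23(d.astype(np.float64), g.astype(np.float64))  # fp64 direct

print("Winograd (fp32):", y_wino)
print("Direct   (fp64):", y_ref64)
print("Abs error      :", np.abs(y_wino - y_ref64.astype(np.float32)))
```

Roughly speaking, for larger output and tile sizes the transform matrices contain larger and more varied constants, which drives the growth in norm and conditioning (and hence the exponential error bound) discussed above; the point-selection, summation-ordering, and mixed-precision methods in the article all target this transform-and-accumulate stage.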
