GCONV Chain: Optimizing the Whole-Life Cost in End-to-end CNN Acceleration

Jiaqi Zhang, Sandip Ray, Xiangru Chen

doi:10.1109/tc.2021.3128159

Abstract

The acceleration of CNNs has gained increasing attention since their success in computer vision. Because the heterogeneous layers of a CNN cannot be processed by accelerators designed for convolution layers only, modern end-to-end CNN acceleration solutions either transform the diverse computation into matrix/vector arithmetic, which loses the data reuse opportunities in convolution, or introduce a dedicated functional unit for each kind of layer, which results in underutilization and high update expenses. To enhance whole-life cost efficiency, we need a solution that processes CNN layers efficiently and is general enough to apply to all kinds of existing and emerging layers. To this end, we propose GCONV Chain, a method that converts the entire CNN computation into a chain of standard general convolutions (GCONV), which existing CNN accelerators can process efficiently with low-overhead hardware support. This paper comprehensively analyzes the GCONV Chain model and proposes a full-stack implementation to support it. Our results on various CNNs demonstrate that GCONV Chain improves the performance and energy efficiency of existing CNN accelerators by an average of 3.4x and 3.2x, respectively. Furthermore, we show that GCONV Chain provides low whole-life costs for CNN acceleration, including both developer effort and total cost of ownership.

Full Text