Abstract

Many modern CNNs feature complex architecture topologies with different layer types. One of these special layers is the fractionally-strided or transposed convolution (T-CONV) layer [1], an up-sampling layer that uses trained weights to produce enlarged, high-resolution feature maps. The atrous or dilated convolution (D-CONV) layer is another special layer that maintains the resolution and coverage of feature maps by expanding the receptive fields of the convolution filters, as discussed in [2]. Both T-CONV and D-CONV layers can be naïvely implemented as normal convolution (N-CONV) layers by inserting S′ − 1 zeros between adjacent pixels of the input feature maps (FMs) for T-CONV, or d − 1 zeros between adjacent values of the filters for D-CONV, where S′ is the T-CONV stride and d is the D-CONV dilation rate. This approach, however, severely underutilizes computation resources because of the zero-valued MAC operations it introduces.
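
The following is a minimal NumPy sketch of the naïve zero-insertion scheme described above, for the single-channel 2-D case. The helper names zero_insert_fm and zero_insert_filter are illustrative, not taken from the paper; padding and kernel-flipping details of the subsequent N-CONV pass are omitted.

```python
import numpy as np

def zero_insert_fm(fm, s_prime):
    """Insert s_prime - 1 zeros between adjacent pixels of a 2-D feature map,
    so a T-CONV of stride s_prime can be run as a normal convolution."""
    h, w = fm.shape
    out = np.zeros(((h - 1) * s_prime + 1, (w - 1) * s_prime + 1), dtype=fm.dtype)
    out[::s_prime, ::s_prime] = fm  # original pixels land on a stride-s_prime grid
    return out

def zero_insert_filter(filt, d):
    """Insert d - 1 zeros between adjacent taps of a square 2-D filter,
    so a D-CONV of dilation rate d can be run as a normal convolution."""
    k, _ = filt.shape
    out = np.zeros(((k - 1) * d + 1, (k - 1) * d + 1), dtype=filt.dtype)
    out[::d, ::d] = filt  # original taps land on a stride-d grid
    return out

# Example: a 3x3 FM up-sampled for a stride-2 T-CONV becomes a 5x5 map
# in which only 9 of the 25 entries are non-zero.
fm = np.arange(1, 10, dtype=np.float32).reshape(3, 3)
up = zero_insert_fm(fm, 2)
print(np.count_nonzero(up) / up.size)  # 0.36 -> ~64% of MAC operands are zero
```

As the example suggests, only about 1/S′² of the zero-inserted feature-map pixels (and, analogously, about 1/d² of the enlarged filter taps) are non-zero, which is exactly the source of the wasted zero MAC operations the abstract refers to.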
