Abstract

Convolution is a core operation in many application domains, and its computation should combine high performance with energy efficiency. This requirement applies both to standard convolution and to its spatial variants, such as dilated, strided, or transposed convolution. In this work, we focus on the design of a streaming convolution engine, called LazyDCstream, that is tuned for dilated convolution. LazyDCstream uses a sliding-window architecture for input data reuse and leverages the known decomposition of dilated convolution to (a) maximize window buffer sharing and (b) enable “lazy” data movement that keeps the number of data transfers per clock cycle as low as possible and, most importantly, independent of the dilation rate. These two architectural features reduce power consumption relative to efficient streaming convolution engines without introducing any throughput or area penalty.
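
The decomposition referred to above can be illustrated in software. The following is a minimal 1-D sketch, not the LazyDCstream hardware design: it assumes the commonly used polyphase view, in which a dilated convolution with dilation rate d is equivalent to d independent standard convolutions, each applied to one subsampled phase of the input, with the outputs interleaved. The function names and the use of NumPy are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def dilated_conv1d(x, w, d):
    """Direct 'valid' dilated convolution (cross-correlation form), dilation d."""
    k = len(w)
    n_out = len(x) - (k - 1) * d  # number of valid output positions
    return np.array([sum(w[j] * x[i + j * d] for j in range(k))
                     for i in range(n_out)])

def dilated_conv1d_decomposed(x, w, d):
    """Same result, computed as d standard convolutions on subsampled phases."""
    k = len(w)
    n_out = len(x) - (k - 1) * d
    y = np.zeros(n_out)
    for p in range(d):
        phase = x[p::d]              # samples p, p + d, p + 2d, ...
        if len(phase) < k:           # phase too short to produce any output
            continue
        # Standard (dilation 1) sliding-window pass over this phase:
        # dense[m] = sum_j w[j] * x[p + (m + j) * d], which is output p + m * d.
        dense = np.correlate(phase, w, mode="valid")
        y[p::d] = dense              # interleave the phase outputs back
    return y

if __name__ == "__main__":
    x = np.arange(1, 24, dtype=float) ** 2
    w = np.array([1.0, -2.0, 3.0])
    for d in (1, 2, 3, 4):
        assert np.allclose(dilated_conv1d(x, w, d),
                           dilated_conv1d_decomposed(x, w, d))
```

In this view each phase is processed by an ordinary sliding window, which suggests why window buffers can be shared across phases; how the engine actually organizes its buffers and data movement is described in the paper itself and is not modeled here.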
