Abstract

The recently proposed Deformable Convolutional Networks (DCNs) greatly enhance the performance of conventional Convolutional Neural Networks (CNNs) on vision recognition tasks by allowing flexible input sampling at inference time. DCNs introduce an additional convolutional layer that generates adaptive sampling offsets, followed by a bilinear interpolation (BLI) step that interpolates input values at the resulting non-integer sampling positions. Finally, a regular convolution is performed on the loaded input pixels. Compared with conventional CNNs, DCNs exhibit significantly higher computational complexity and irregular, input-dependent memory access patterns, which makes deploying DCNs on edge devices for real-time computer vision tasks a great challenge. In this work, we propose RECOIN, a processing-in-memory (PIM) architecture that supports DCN inference on resistive memory (ReRAM) crossbars, thus making the first DCN inference accelerator possible. We present a novel BLI processing engine that leverages both row- and column-oriented computation for in-situ BLI calculation. A mapping scheme and an address converter are specially designed to accommodate the intensive computation and irregular data access. We implement DCN inference as a 4-stage pipeline and evaluate the effectiveness of RECOIN on six DCN models. Experimental results show that RECOIN achieves 225× and 17.4× improvements in energy efficiency compared to a general-purpose CPU and GPU, respectively. Compared to two state-of-the-art ASIC accelerators, RECOIN achieves 26.8× and 20.4× speedup, respectively.
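To make the BLI step concrete, the following is a minimal NumPy sketch of bilinear sampling at a fractional position, as a deformable layer would do for each learned offset. The function name and the example offset values are illustrative, not taken from the paper:

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map at fractional coordinates (y, x)
    by bilinearly weighting its four integer-grid neighbors.
    Assumes (y, x) lies inside the feature map bounds."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = y0 + 1, x0 + 1
    dy, dx = y - y0, x - x0  # fractional parts serve as weights
    return ((1 - dy) * (1 - dx) * feature[y0, x0]
            + (1 - dy) * dx * feature[y0, x1]
            + dy * (1 - dx) * feature[y1, x0]
            + dy * dx * feature[y1, x1])

# Hypothetical example: sample at grid point (2, 3) shifted by a
# learned fractional offset (0.4, -0.7) from the offset branch.
feature = np.arange(25, dtype=np.float32).reshape(5, 5)
print(bilinear_sample(feature, 2 + 0.4, 3 - 0.7))
```

Because each output pixel reads four input pixels whose addresses depend on runtime-generated offsets, the access pattern is irregular and input-dependent, which is the challenge RECOIN's BLI engine and address converter are designed to handle.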
