Abstract

This brief proposes an FPGA-based solution that demonstrates efficient computation for rotated object detection based on the R³Det algorithm. The approach first designs reconfigurable neural processing units (NPUs) for convolutional neural networks (CNNs) together with a dedicated architecture for spatial operations, and then adopts a novel scheduling scheme to handle the data dependencies among these modules. Implemented on a cost-effective Kintex UltraScale+ XCKU15P FPGA, the proposed solution achieves nearly 3.20 times the energy efficiency of an NVIDIA V100 GPU. The proposed reconfigurable NPU achieves 73.20% DSP efficiency, a 30.39% relative improvement over the 56.14% DSP efficiency of the state-of-the-art CNN accelerator on ResNet-50.
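
The 30.39% figure is consistent with a relative (not percentage-point) improvement over the 56.14% baseline. This is a reading inferred from the quoted numbers rather than a definition stated in the abstract; under that assumption the arithmetic works out as follows:

```latex
\[
\frac{73.20 - 56.14}{56.14} = \frac{17.06}{56.14} \approx 0.3039 \quad (\text{about } 30.39\%)
\]
```

In absolute terms the gap is 17.06 percentage points of DSP efficiency.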
