The accuracy of current deep learning algorithms has certainly increased. However, deploying deep learning networks on edge devices with limited resources is challenging due to their inherent depth and high parameter count. Here, we proposed an improved YOLO model based on an attention mechanism and receptive field (RFA-YOLO) model, applying the MobileNeXt network as the backbone to reduce parameters and complexity, adopting the Receptive Field Block (RFB) and Efficient Channel Attention (ECA) modules to improve the detection accuracy of multi-scale and small objects. Meanwhile, an FPGA-based model deployment solution was proposed to implement parallel acceleration and low-power deployment of the detection algorithm model, which achieved real-time object detection for optical remote sensing images. We implement the proposed DPU and Vitis AI-based object detection algorithms with FPGA deployment to achieve low power consumption and real-time performance requirements. Experimental results on DIOR dataset demonstrate the effectiveness and superiority of our RFA-YOLO model for object detection algorithms. Moreover, to evaluate the performance of the proposed hardware implementation, it was implemented on a Xilinx ZCU104 board. Results of the experiments for hardware and software simulation show that our DPU-based hardware implementation are more power efficient than central processing units (CPUs) and graphics processing units (GPUs), and have the potential to be applied to onboard processing systems with limited resources and power consumption.
Read full abstract