FFR-SSD: feature fusion and reconstruction single shot detector for multi-scale object detection

Xu Cheng, Zhixiang Wang, Chen Song, Zitong Yu

doi:10.1007/s11760-023-02536-9

Abstract

Object detection is one of the most fundamental tasks in image content understanding. Although numerous algorithms have been proposed, effective and efficient object detection remains very challenging. To address the challenges posed by small and multi-scale objects, we propose a hierarchical feature fusion and reconstruction method called the feature fusion and reconstruction single shot detector (FFR-SSD). First, we present a multi-scale visual attention model that incorporates channel-wise and space-wise information into a multi-branch feature enhancement module to improve feature representation capacity. Second, we develop a hierarchical feature map weighting mechanism to fuse multi-layer feature maps, which helps describe intact objects for the subsequent module. Third, we present an effective feature map reconstruction module that encourages the model to focus on pivotal information. As a result, overall contour information is preserved in the shallow enhanced response map, and semantic feature information is enriched in the deep enhanced feature map. Extensive experiments on two public benchmark datasets show that the proposed method achieves significant improvements over state-of-the-art methods.

Full Text