ABSTRACT In image object detection and semantic segmentation, improving the accuracy of object identification and segmentation is a primary goal. Exploiting multi-scale information through feature map refinement and fusion is a widely recognized way to achieve it. However, existing feature fusion methods design more complex feature pyramid networks, replace existing detectors, or incrementally add feature fusion modules, and they overlook an effective alternative: enhancing the spatial information in deep feature maps. We propose a novel pluggable feature fusion paradigm termed ‘Effective Learning Bridge’. It introduces an efficient, adaptive learning mechanism that builds learning bridges between feature maps at different scales within the feature pyramid, thereby enhancing the spatial information of objects in deep feature maps. The mechanism is designed specifically for multi-scale feature maps and can be integrated into any network that incorporates feature maps. By altering the model’s backpropagation path, it improves learning efficiency, which in turn improves the accuracy of object detection and segmentation. We evaluated the proposed paradigm and method extensively on the SIMD, HRSID, and WHDLD datasets with benchmark models. The results demonstrate that our approach significantly improves the accuracy of object detection and semantic segmentation, as well as the overall learning efficiency of the model.