Abstract

In computer vision, detecting multiple objects of different scales within a single image is challenging. To address this problem, feature pyramids are a basic component of most multi-scale object detectors. In standard feature pyramid construction, features at different semantic levels are simply combined to build a new feature map, regardless of whether they have a positive effect on the output. To avoid introducing too many redundant features in the feature fusion stage, this paper proposes a new feature fusion module, the Feature Selection Module (FSM), which automatically identifies the most representative features for rebuilding feature maps. The channel attention mechanism in FSM processes and scores each channel, filtering out irrelevant features while focusing on those with a high contribution. Moreover, FSM can be easily embedded in feature pyramids: adding only a small number of trainable parameters to the network can significantly improve its feature extraction ability. We validated FSM on the VOC 2007 object detection dataset using YOLO-series detectors. The results demonstrate that, at a small computational cost, our method consistently improves the performance of YOLO detectors.
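
The abstract does not specify the internal layout of FSM, so the following is only a minimal sketch of the general idea of channel attention, assuming a squeeze-and-excitation style gate: each channel is pooled to a single value, scored in (0, 1) by a small learned layer, and then rescaled by its score. The function names (`channel_scores`, `select_features`) and the `weights` and `biases` parameters are hypothetical and do not come from the paper.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_scores(feature_map, weights, biases):
    """Score each channel of a feature map.

    feature_map: list of C channels, each an H x W list of lists.
    weights:     C x C matrix (hypothetical learned parameters).
    biases:      list of C values (hypothetical learned parameters).
    """
    # Squeeze: global average pooling reduces each channel to one value.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Score: one linear layer followed by a sigmoid gives a weight per channel.
    return [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, biases)
    ]


def select_features(feature_map, weights, biases):
    """Rescale every channel by its score (assumed soft selection)."""
    scores = channel_scores(feature_map, weights, biases)
    return [
        [[value * score for value in row] for row in channel]
        for channel, score in zip(feature_map, scores)
    ]


# Toy input: channel 0 is strongly activated, channel 1 is silent.
fmap = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
weights = [[10.0, 0.0], [0.0, 10.0]]
biases = [-5.0, -5.0]

scores = channel_scores(fmap, weights, biases)
gated = select_features(fmap, weights, biases)
```

In this toy setting the active channel receives a score close to 1 and the silent channel a score close to 0, which illustrates how low-scoring channels are suppressed before fusion. A real module would learn `weights` and `biases` during training and would normally be written with a tensor library.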
