Abstract

With the development of deep learning, research on convolutional neural network-based object detection has matured considerably. However, most methods still handle the imbalance between semantic and spatial information poorly. In this article, we extend the single-shot multibox detector (SSD) and propose a self-learning multi-scale object detection network, named SLMS-SSD, that balances semantic and spatial information. First, we construct a shallow feature enhancement module that strengthens the representation of small objects by extracting richer contextual information. Second, in terms of feature connectivity, we design a multi-scale feature selection module for intermediate-layer features that combines top-down and direct up-sampling. Finally, in terms of feature strength, we design a self-learning feature fusion module that measures the importance of each feature. We validate our model on the PASCAL VOC and MS COCO datasets, and the results demonstrate that it effectively improves object detection accuracy, especially for small objects.
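The abstract does not include implementation details, but as a rough illustration of what a learnable, importance-weighted fusion of multi-scale features could look like, here is a minimal PyTorch-style sketch. The class name, the softmax-normalized scalar weights, the bilinear up-sampling, and the post-fusion convolution are all assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfLearningFusion(nn.Module):
    """Hypothetical sketch: each input feature map gets a learnable
    importance weight; weights are softmax-normalized and the maps are
    resized to a common resolution before a weighted sum."""

    def __init__(self, num_inputs: int, channels: int):
        super().__init__()
        # One learnable scalar per input feature map (its importance weight).
        self.weights = nn.Parameter(torch.ones(num_inputs))
        # Light post-fusion convolution to mix the combined features.
        self.post = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, features):
        # `features`: list of maps with the same channel count but possibly
        # different spatial sizes; align them to the first map's resolution.
        target_size = features[0].shape[-2:]
        aligned = [
            f if f.shape[-2:] == target_size
            else F.interpolate(f, size=target_size, mode="bilinear",
                               align_corners=False)
            for f in features
        ]
        # Normalize the learned importance weights and take a weighted sum.
        w = torch.softmax(self.weights, dim=0)
        fused = sum(wi * fi for wi, fi in zip(w, aligned))
        return self.post(fused)


if __name__ == "__main__":
    # Toy usage: fuse three 256-channel maps from different pyramid levels.
    maps = [torch.randn(1, 256, s, s) for s in (38, 19, 10)]
    fusion = SelfLearningFusion(num_inputs=3, channels=256)
    print(fusion(maps).shape)  # torch.Size([1, 256, 38, 38])
```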
