Abstract
Research on deep convolutional neural networks has significantly improved the speed and accuracy of object detection. Despite these advances, accurately detecting multiple objects, small objects in particular, remains challenging in traffic environments. In this paper, we propose a new architecture called YOLOM, which is specifically designed to improve detection precision for multiple small objects. YOLOM incorporates three components: a multi-spatial pyramid (MSP) module, an optimized focal loss (OFLoss) function, and an objectness loss based on effective intersection over union (EIoU). Together, these components improve accuracy and reduce the miss rate for small objects, particularly in multi-object scenes. The MSP module uses pooling layers to capture features whose receptive fields span different spatial scales. For classification, we replace the cross-entropy loss with an optimized focal loss, which mitigates some of the class imbalance problems that anchor-free detection encounters on disparate datasets. Because EIoU performs well in confidence scoring, we use it in place of YOLOX's objectness loss. The experimental results demonstrate that our strategies significantly outperform some end-to-end object detection methods.
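As an illustration of the two loss ingredients named above, the following is a minimal sketch and not the YOLOM implementation. It assumes the commonly published EIoU formulation (1 - IoU plus normalized center-distance, width, and height penalties) and the standard binary focal loss; the optimized variant (OFLoss) is not specified in this abstract and is not reproduced here. The function names `eiou_loss` and `focal_loss`, the `(x1, y1, x2, y2)` box layout, and the default `alpha` and `gamma` values are assumptions made for the example.

```python
import math


def eiou_loss(box_a, box_b, eps=1e-9):
    """EIoU loss between two boxes given as (x1, y1, x2, y2).

    Assumed formulation: 1 - IoU + center term + width term + height term,
    each penalty normalized by the smallest enclosing box.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of both boxes.
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (ax1 + ax2) / 2.0 - (bx1 + bx2) / 2.0
    dy = (ay1 + ay2) / 2.0 - (by1 + by2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (aw - bw) ** 2 / (cw * cw + eps)
    height_term = (ah - bh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-9):
    """Standard binary focal loss for one predicted probability p and label y.

    This is the baseline focal loss, not the paper's optimized OFLoss.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t + eps)
```

Under these assumptions, `eiou_loss` is close to zero for identical boxes and exceeds 1 for disjoint boxes, so it still provides a training signal when a small predicted box does not overlap its target. `focal_loss` shrinks the contribution of confidently correct predictions, which is the mechanism by which focal-style losses address class imbalance.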