Abstract

AbstractTo tackle challenges in road multi‐object detection, such as object occlusion, small object detection, and multi‐scale object detection difficulties, a new YOLOv8n‐RSFM structure is proposed. The key improvement of this structure lies in the introduction of the transformer decoder head, which optimizes the matching between the ground truth and predicted boxes, thereby effectively addressing issues of object overlap and multi‐scale detection. Additionally, a small object detection layer is incorporated to retain crucial information beneficial for detecting small objects, significantly improving the detection accuracy for small targets. To enhance learning capacity and reduce redundant computations, the FasterNet backbone is employed to replace CSPDarknet53, thus accelerating the training process. Finally, the INNER‐MPDIoU loss function is introduced to replace the original algorithm's complete IoU to accelerate the convergence and obtain more accurate regression results. A series of experiments were conducted on different datasets. The experimental results show that the proposed model YOLOv8N‐RSFM outperforms the original model YOLOv8n in small target detection. On the VisDrone, TinyPerson, and VSCrowd datasets, the mean accuracy percentage improved by 7.9%, 12.3%, and 4.5%, respectively.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.