Abstract

Object detection comprises three subtasks: predicting target position, class, and confidence. Mainstream object detection models pursue internal structural refinement, and the subtasks share nearly the same structure, i.e., a task-coupled design. Task coupling reduces the number of training parameters, but the network structure cannot be tuned for each task separately, which can limit model performance. We design a task-decoupled object detection network (YOLOD) based on YOLOv5, in which the network is decoupled immediately after the backbone. By observing the loss convergence of each subtask, we design three separate branch structures and control the branch sizes so that the model retains relatively few training parameters. We also make several experimental adjustments to YOLOD to accelerate convergence. In addition, we append image contour information to the original three-channel input to assist training and improve detection accuracy. Experiments show that the modified model is smaller and that the accuracy gain is largest on the small-scale variant: without introducing any attention-based modules, YOLOD-S improves mAP by 1.1% on the MS COCO dataset and 2.29% on the VOC dataset, and the larger YOLOD-L reaches 48.8% mAP on COCO.
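The two core ideas of the abstract, decoupling the three subtask heads right after the backbone and appending a contour channel to the RGB input, can be illustrated with a minimal sketch. This is not the paper's implementation: the backbone is replaced by a trivial pooling-plus-projection stand-in, the contour map is a crude gradient-magnitude edge estimate, and all weight shapes (feature size 256, 80 classes) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_channel(rgb):
    # Crude contour map: gradient magnitude of the grayscale image,
    # normalized to [0, 1]. A stand-in for the paper's contour information.
    gray = rgb.mean(axis=2)
    gy, gx = np.gradient(gray)
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-8)

# Hypothetical 4-channel input: original RGB plus the contour channel.
img = rng.random((64, 64, 3))
x = np.concatenate([img, edge_channel(img)[..., None]], axis=2)  # (64, 64, 4)

# Stand-in "backbone": global average pool + linear projection to a feature vector.
W_backbone = rng.standard_normal((4, 256))
feat = x.mean(axis=(0, 1)) @ W_backbone  # (256,)

# Decoupled heads: each subtask gets its own, independently sized branch,
# so the structure of one branch can be tuned without touching the others.
num_classes = 80
W_box = rng.standard_normal((256, 4))            # position branch -> (x, y, w, h)
W_cls = rng.standard_normal((256, num_classes))  # classification branch
W_obj = rng.standard_normal((256, 1))            # confidence (objectness) branch

box = feat @ W_box
cls = feat @ W_cls
obj = feat @ W_obj

print(box.shape, cls.shape, obj.shape)  # (4,) (80,) (1,)
```

In the coupled design, all three outputs would come from one shared tail; here each branch has its own weights, which is what lets the paper size and shape the branches separately based on how each subtask's loss converges.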
