Abstract. Target detection is a core task in computer vision and plays an essential role in many applications. This paper reviews contemporary target detection methods based on deep learning and evaluates their architectures and performance on the Common Objects in Context (COCO) dataset. The models are compared by mean Average Precision (mAP) and input dimensions to assess their suitability for different detection tasks. You Only Look Once version 3 (YOLOv3) stands out for combining high-speed detection with strong accuracy, which addresses the demands of real-time processing. The results indicate that YOLOv3, despite its smaller input size, performs comparably to more complex systems, reflecting gains in design efficiency and processing speed. These findings point to further optimizations in model architecture that could narrow the gap between high-speed and high-accuracy detection, with potential benefits for real-time applications across multiple sectors. The work is intended to guide future developments in object detection for both academic research and industrial applications.