The introduction of Convolutional Neural Networks (CNNs) revolutionized object detection in computer vision. This article explores the architectures, methodological differences, and performance characteristics of three CNN-based object detection algorithms, namely Faster Region-Based Convolutional Neural Network (Faster R-CNN), You Only Look Once v3 (YOLO v3), and Single Shot MultiBox Detector (SSD), applied to vehicle detection. The findings indicate that SSD outperforms the other two approaches in both mean average precision and processing speed. Faster R-CNN detected objects in images with an average detection time of 5.1 s, achieving a mean average precision of 0.76 and an average loss of 0.467. YOLO v3 detected objects with an average detection time of 1.16 s, achieving a mean average precision of 0.81 with an average loss of 1.183. In contrast, SSD detected objects with an average detection time of 0.5 s and exhibited the highest mean average precision of 0.92, despite having the highest average loss of 2.625. Notably, all three object detectors achieved an accuracy exceeding 99%.