Optimizing YOLOv8 for Real-Time CCTV Surveillance: A Trade-off Between Speed and Accuracy

Muhammad Rizqi Sholahuddin, Fachri Dhia Fauzan, Iwan Awaludin, Bima Putra Sudimulya, Maisevli Harika, Yunita Citra Dewi, Vandha Pradiyasma Widarta

doi:10.15575/join.v8i2.1196

Abstract

Real-time video surveillance, especially with CCTV systems, requires fast and accurate face detection. Object detection models with slow inference are ineffective in real-time settings. This study addresses that challenge by improving the inference speed of YOLOv8, a leading object detection framework known for its accuracy and speed. We prune the model's architecture, specifically the P5 head section, which detects larger objects. According to Bochkovskiy's 2020 research, this modification enhances the model's performance specifically for medium and small objects in CCTV footage. We compared the standard YOLOv8 model and its pruned version on inference time, mean Average Precision (mAP), and model weight. The pruned model cuts inference time by 15.56%, from 4.5 ms to 3.8 ms, and reduces model weight. These gains are offset by a 1.6% decrease in mAP. The results demonstrate that architectural modifications can make the model faster and lighter, and therefore better suited to real-time surveillance, at a slight cost in accuracy. The findings matter for implementing efficient object detection systems in CCTV surveillance and lay the groundwork for future research on improving the speed-accuracy trade-off of such systems.
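The reported speed-up can be checked with a short calculation. The sketch below is illustrative only: the two timings come from the abstract, while the function names and the frames-per-second estimate are assumptions added here. In particular, the throughput figure assumes that model inference is the only per-frame cost, which the abstract does not state.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


def frames_per_second(latency_ms: float) -> float:
    """Hypothetical upper bound on throughput if inference were the only cost."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


baseline_ms = 4.5  # standard YOLOv8 inference time, from the abstract
pruned_ms = 3.8    # pruned YOLOv8 inference time, from the abstract

reduction_pct = relative_reduction(baseline_ms, pruned_ms)
print(f"Inference time reduction: {reduction_pct:.2f}%")  # prints 15.56%

# Assumed, not reported: idealized throughput implied by each latency.
print(f"Baseline upper bound: {frames_per_second(baseline_ms):.1f} FPS")
print(f"Pruned upper bound:   {frames_per_second(pruned_ms):.1f} FPS")
```

The first printed value matches the 15.56% reduction stated in the abstract; the throughput lines are only a rough way to read the latency figures and are not results from the study.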
