Abstract

Unmanned Aerial Vehicles (UAVs) have become useful for various civil applications, such as traffic monitoring and smart parking, where real-time vehicle detection and classification is one of the key tasks. Detecting vehicles from UAV imagery poses many challenges, including small object sizes and variation in the UAV’s altitude and viewing angle. Since classic object detection solutions struggle with these challenges, recent methods build on convolutional neural networks and their capacity for effective feature learning. Given the computational complexity of these networks and the need for accurate, real-time detection, balancing accuracy and inference speed is essential for efficiency. This paper proposes an accurate, efficient and real-time vehicle detection network based on the successful YOLOv5 object detection model. This is done by improving the structure of the model, adding an attention mechanism and using an adaptive bounding box regression loss function. In addition, to meet real-time inference requirements, the depth and width of the model were balanced and ghost convolution was incorporated into the Neck unit to further improve the trade-off between accuracy and inference speed. The proposed method is evaluated on three urban UAV imagery datasets specifically intended for civil applications: VisDrone, CARPK and VAID. Compared with the YOLOv5 baseline models, the proposed method achieves 3.52% higher mAP50 and 207.15% higher FPS than YOLOv5X on the VisDrone dataset, while being much smaller in size and GFLOPs. Overall, these results show how the applied structural and conceptual modifications can move the YOLO family towards smaller size, higher accuracy and faster inference.
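The ghost convolution mentioned in the abstract (from GhostNet) replaces part of an ordinary convolution with cheap depthwise operations: a primary convolution produces a fraction of the output channels, and the rest are generated by inexpensive per-channel filters. A rough multiply-accumulate count, using hypothetical layer sizes (not taken from the paper), sketches why this roughly halves the cost at a typical ratio of s = 2:

```python
def conv_macs(c_in, c_out, k, h, w):
    # Multiply-accumulates of a standard k x k convolution
    # over an h x w output feature map.
    return c_out * c_in * k * k * h * w

def ghost_conv_macs(c_in, c_out, k, h, w, s=2, d=3):
    # Ghost module: a primary conv produces c_out // s "intrinsic"
    # feature maps; cheap d x d depthwise ops generate the remaining
    # (s - 1) / s share of the channels.
    intrinsic = c_out // s
    primary = intrinsic * c_in * k * k * h * w  # ordinary conv part
    cheap = (s - 1) * intrinsic * d * d * h * w  # depthwise part
    return primary + cheap

# Hypothetical Neck-layer shape: 256 -> 256 channels, 3x3 kernel, 40x40 map.
std = conv_macs(256, 256, 3, 40, 40)
ghost = ghost_conv_macs(256, 256, 3, 40, 40)
print(f"standard: {std:,}  ghost: {ghost:,}  ratio: {std / ghost:.2f}x")
# → standard: 943,718,400  ghost: 473,702,400  ratio: 1.99x
```

For large input channel counts the depthwise term is negligible, so the saving approaches the ratio s, which is how ghost convolution trades a small amount of feature diversity for inference speed.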
