Abstract
To further improve the speed and accuracy of object detection, especially for small targets and occluded objects, a novel and efficient detector named YOLO-ACN is presented. The model builds on YOLOv3, which offers high detection accuracy and speed, and improves it by adding an attention mechanism, a CIoU (complete intersection over union) loss function, Soft-NMS (non-maximum suppression), and depthwise separable convolution. First, the attention mechanism is introduced in the channel and spatial dimensions of each residual block to focus on small targets. Second, CIoU loss is adopted to achieve accurate bounding box (BBox) regression. In addition, to select more accurate BBoxes and avoid deleting occluded objects in dense images, CIoU is applied in Soft-NMS, and the Gaussian model in Soft-NMS is employed to suppress the surrounding BBoxes. Third, to significantly reduce the number of parameters and improve detection speed, standard convolution is replaced by depthwise separable convolution, and the hard-swish activation function is used in the deeper layers. Quantitative experiments on the MS COCO dataset and the KAIST infrared pedestrian dataset show that, compared with other state-of-the-art models, YOLO-ACN achieves high accuracy and speed in detecting small targets and occluded objects. On the MS COCO dataset, YOLO-ACN reaches an mAP50 (mean average precision at an IoU threshold of 0.5) of 53.8% and an APs (average precision for small objects) of 18.2% with a real-time inference time of 22 ms, and on the KAIST dataset the mAP for a single class exceeds 80% on an NVIDIA Tesla K40.
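The CIoU measure and the Gaussian Soft-NMS rescoring described in the abstract can be illustrated with a minimal sketch. It uses the commonly published CIoU definition (IoU minus a normalized center-distance term and an aspect-ratio term) and the Gaussian decay exp(-overlap^2 / sigma). The function names `ciou` and `soft_nms`, the `sigma` and `score_thresh` defaults, and the clamping of negative CIoU values to zero before the decay are assumptions made for illustration. They are not taken from the YOLO-ACN implementation.

```python
import math

def ciou(a, b, eps=1e-9):
    """CIoU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Plain IoU from intersection and union areas.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)
    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1 + eps))
        - math.atan((ax2 - ax1) / (ay2 - ay1 + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / (c2 + eps) - alpha * v

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS that decays scores by CIoU overlap instead of deleting boxes."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        box, score = candidates.pop(best)
        kept.append((box, score))
        rescored = []
        for other, s in candidates:
            # Assumption: negative CIoU (distant boxes) is treated as zero overlap.
            overlap = max(0.0, ciou(box, other))
            s *= math.exp(-(overlap ** 2) / sigma)
            if s >= score_thresh:
                rescored.append((other, s))
        candidates = rescored
    return kept
```

In this sketch a heavily overlapping neighbor keeps a reduced score rather than being removed outright, which is the property that helps retain occluded objects in dense scenes.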
Highlights
Object detection utilizes computers and related algorithms to find objects of certain target classes with precise localization [1]
In terms of average detection accuracy, the attention mechanism raises the accuracy of the model from 51.3% to 55.7%, a gain of 4.4 percentage points; compared with the CIoU and Soft-NMS algorithm, which improves the model by about 1%, the attention mechanism has the largest impact on the precision improvement of the detection model
A one-stage detection model YOLO-ACN is proposed by developing a lightweight network with the attention mechanism, improving the measurement of bounding box (BBox), introducing the CIoU loss function, and optimizing the Soft-NMS
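As a rough illustration of why the lightweight design reduces parameters, the sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution, and gives the usual definition of hard-swish, x * ReLU6(x + 3) / 6. The layer sizes in the usage note are hypothetical and are not taken from the YOLO-ACN architecture; bias terms are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def dw_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

def hard_swish(x):
    """Piecewise-linear approximation of swish: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0
```

For a hypothetical 3 x 3 layer with 256 input and 512 output channels, `conv_params(256, 512, 3)` gives 1,179,648 weights, while `dw_separable_params(256, 512, 3)` gives 133,376, roughly a ninefold reduction.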
Summary
Object detection utilizes computers and related algorithms to find objects of certain target classes with precise localization [1]. Real-time and accurate object detection can provide good conditions for object tracking, behavior recognition, scene understanding, and medical detection. Significant improvements have been made in object detection by using traditional and deep learning methodologies. However, few studies have focused on detecting small targets and occluded objects, and detection accuracy and speed still need to be further improved [2]. Small targets and occluded objects have few effective pixels, carry only a few incomplete features, and are largely submerged in noise and background clutter. After multiple downsampling and pooling operations, considerable