Abstract

The lack of vehicle feature information and the limited number of pixels that a vehicle occupies in high-definition remote-sensing images make vehicle detection difficult. This paper proposes U-YOLO, a vehicle detection method that integrates multi-scale features, attention mechanisms, and sub-pixel convolution. An adaptive fusion (AF) module is added to the backbone of the YOLO detection model to enrich the low-level structural information of the feature map. Cross-scale channel attention (CSCA) is introduced in the feature fusion part to obtain the vehicle’s explicit semantic information and further refine the feature map. A sub-pixel convolution (SC) module replaces the linear interpolation up-sampling of the original model, enlarging the vehicle target feature map to further improve detection accuracy. The detection accuracies on the open-source datasets NWPU VHR-10 and DOTA were 91.35% and 71.38%, respectively. Compared with the original network model, the detection accuracy on these two datasets increased by 6.89% and 4.94%, respectively. Compared with the classic target detection networks RFBnet, M2det, and SSD300, the average accuracy increased by 6.84%, 6.38%, and 12.41%, respectively. The proposed method effectively addresses the problem of low vehicle detection accuracy and provides a basis for promoting the application of high-definition remote-sensing images in traffic target detection and traffic flow parameter detection.
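To make the sub-pixel convolution step more concrete, the sketch below shows only the channel-to-space rearrangement (often called pixel shuffle) that such a module typically ends with. It is not the authors' SC module: the function name `pixel_shuffle`, the nested-list layout, and the up-scaling factor `r` are assumptions made for illustration, and the learned convolution that would produce the `C*r*r` input channels is omitted.

```python
def pixel_shuffle(feature_map, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Hypothetical helper for illustration; the channel ordering follows
    the common convention: input channel c*r*r + i*r + j supplies the
    output pixel at row offset i and column offset j.
    """
    channels_in = len(feature_map)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    height, width = len(feature_map[0]), len(feature_map[0][0])
    channels_out = channels_in // (r * r)
    # Allocate the enlarged output: each channel is r times taller and wider.
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):
            for j in range(r):
                src = feature_map[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x2 channels become one 2x4 channel for r = 2.
low_res = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
high_res = pixel_shuffle(low_res, 2)
print(high_res)  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

Unlike linear interpolation, which computes new pixels from fixed weights, this rearrangement lets the preceding convolution learn the values of every up-sampled pixel, which is one plausible reason it helps with targets that cover only a few pixels.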
