Abstract
Target detection in unmanned aerial vehicle (UAV) scenarios faces particular challenges, such as densely packed targets. Existing UAV target detection models have high computational complexity, which makes real-time detection difficult, and their detection accuracy on small targets is low. To address these problems, we propose CPA-YOLOv7, an improved YOLOv7 small target detection model based on context and pyramidal attention that can cope with dense UAV scenarios. First, the model embeds our proposed lightweight multi-scale attentional feature spatial pyramid pooling module, which better distinguishes small target features from large target features and reduces computation while improving detection accuracy. Second, we design a contextual dynamic fusion attention module that fuses global and local contextual information and dynamically assigns features to multiple groups of channels; during multi-scale fusion, it strengthens the representation of small target features and enables the network to focus better on small target information. Finally, we improve the Wise-Intersection-over-Union (WIoU) loss and use it as the regression loss function: we add a moderation factor that retains part of the weight of high- and low-quality samples to improve the regression accuracy of high-quality anchor boxes, and we use the dynamic non-monotonic focusing mechanism to increase the model's focus on ordinary-quality anchor boxes, improving both localization performance and robustness to low-quality samples. Extensive experiments on the UAV datasets VisDrone2021-DET and AI-TOD show that the mAP of our model is 2.3% and 1.1% higher, respectively, than that of YOLOv7 with fewer parameters, and that its inference speed reaches 146 frames per second (FPS), which meets the real-time requirements of UAV detection.
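For readers unfamiliar with the dynamic non-monotonic focusing mechanism mentioned above, the following is a minimal sketch of the baseline Wise-IoU formulation as it is commonly described, not the improved loss proposed in this paper. The moderation factor is not specified in the abstract and is deliberately omitted. The function names, the `(x1, y1, x2, y2)` box layout, and the defaults `alpha=1.9` and `delta=3.0` are assumptions made for illustration; in a real training loop the mean IoU loss would typically be a running average, and the enclosing-box term would be excluded from gradient computation.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, target):
    """Distance-weighted IoU loss; returns (loss, plain IoU loss)."""
    l_iou = 1.0 - iou(pred, target)
    # Centre points of the predicted and ground-truth boxes.
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    dist = ((cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2) / (wg ** 2 + hg ** 2)
    return math.exp(dist) * l_iou, l_iou


def focusing_gain(l_iou, mean_l_iou, alpha=1.9, delta=3.0):
    """Non-monotonic gain: small for very good and very poor boxes."""
    beta = l_iou / mean_l_iou  # outlier degree of this anchor box
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3(pred, target, mean_l_iou, alpha=1.9, delta=3.0):
    """WIoU v1 loss scaled by the dynamic focusing gain."""
    loss_v1, l_iou = wiou_v1(pred, target)
    return focusing_gain(l_iou, mean_l_iou, alpha, delta) * loss_v1
```

With these assumed defaults the gain peaks for anchor boxes of moderate outlier degree and falls off on both sides, which is the behaviour the abstract describes as focusing on ordinary-quality anchor boxes.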
More From: Journal of Visual Communication and Image Representation