Abstract

Object detection is the core technology of an environmental perception system and has become an active research direction for intelligent driving vehicles. Existing CNN–Transformer hybrid models generalize poorly, which makes it difficult for them to meet the detection requirements for small objects in complex scenes. We propose a novel convolutional neural network (CNN)–Transformer Adaptive Feature Fusion Network (CTAFFNet) for object detection. First, we design a local–global feature fusion unit, the Convolutional Transformation Adaptive Fusion Kernel (CTAFFK), which is integrated into CTAFFNet. CTAFFK uses two branches, a CNN branch and a Transformer branch, to extract local and global features from the image, and adaptively fuses the features from both branches. Second, we develop an adaptive feature fusion strategy that combines local high-frequency and global low-frequency features to obtain comprehensive feature information. Finally, CTAFFNet employs an encoder–decoder structure to facilitate the flow of fused local–global information between stages, which helps the model generalize. Experiments on the large and challenging KITTI dataset demonstrate the effectiveness and efficiency of the proposed network. Compared with other mainstream networks, CTAFFNet achieves an average precision of 91.17% and is particularly strong on small objects at longer distances, where it reaches 70.18% accuracy, while also providing real-time detection.
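The abstract does not say how the two branches are fused, so the following is only a minimal sketch of one common form of adaptive fusion: a per-channel sigmoid gate that weights the local (CNN) features against the global (Transformer) features. The function name `adaptive_fuse`, the gate parameters `w_local`, `w_global` and `bias`, and the use of global average pooling are all assumptions made for illustration, not the CTAFFK design.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_fuse(local_feat, global_feat, w_local, w_global, bias):
    """Fuse two feature maps channel by channel with a gate.

    local_feat, global_feat: list of channels, each a flat list of floats
    (hypothetical stand-ins for CNN-branch and Transformer-branch outputs).
    w_local, w_global, bias: per-channel gate parameters (hypothetical;
    in a real network these would be learned).
    """
    fused = []
    for c, (lc, gc) in enumerate(zip(local_feat, global_feat)):
        # Global average pooling summarises each branch for this channel.
        l_mean = sum(lc) / len(lc)
        g_mean = sum(gc) / len(gc)
        # Gate in (0, 1): the share given to the local branch.
        gate = sigmoid(w_local[c] * l_mean + w_global[c] * g_mean + bias[c])
        # Convex combination of the two branches, element by element.
        fused.append([gate * l + (1.0 - gate) * g for l, g in zip(lc, gc)])
    return fused


# One channel with two spatial positions; zero parameters give gate = 0.5,
# so the result is the element-wise average of the two branches.
local_branch = [[1.0, 3.0]]
global_branch = [[3.0, 5.0]]
print(adaptive_fuse(local_branch, global_branch, [0.0], [0.0], [0.0]))
# prints [[2.0, 4.0]]
```

Because the gate depends on the input features, the mixing ratio changes from image to image and from channel to channel, which is the sense in which such a fusion is "adaptive" in this sketch.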
