GANsformer: A Detection Network for Aerial Images with High Performance Combining Convolutional Network and Transformer

Yan Zhang, Shuyu Chen, Xi Liu, Shiyun Wa, Qin Ma

doi:10.3390/rs14040923

Open Access

https://doi.org/10.3390/rs14040923

Journal: Remote Sensing
Publication Date: Feb 14, 2022
Citations: 24
License type: CC BY 4.0

Affiliation: China Agricultural University

Abstract

There has been substantial progress in small object detection in aerial images in recent years, owing to the widespread application and improved performance of convolutional neural networks (CNNs). Traditional machine learning algorithms typically prioritize inference speed over accuracy. With insufficient samples, convolutional neural networks can suffer from instability, non-convergence, and overfitting. Detection in aerial images also has inherent challenges, such as varying altitudes and illumination conditions and blurred, densely packed objects, which result in low detection accuracy. To address these problems, this paper adds a transformer backbone with an attention mechanism as a branch network that uses region-wide feature information. It also employs a generative model to expand the input aerial images ahead of the backbone, so that the respective advantages of the generative model and the transformer network are combined. On the dataset presented in this study, adding the Multi-GANs module to the one-stage detection network yields 96.77% precision, 98.83% recall, and 97.91% mAP. These three indices are improved by 13.9%, 20.54%, and 10.27%, respectively, compared with the other detection networks. Furthermore, this study provides an auto-pruning technique that can reach an inference speed of 32.2 FPS with a minor performance loss, to suit the usage environment of real-time detection tasks. This research also develops a macOS application for the proposed algorithm using Swift.
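The abstract reports precision, recall, and mAP without restating how they are defined. As a reminder of what those figures measure, the following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function names, the non-interpolated form of average precision, and the assumption that each detection has already been matched to ground truth (for example by an IoU threshold) are illustrative choices that do not come from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(is_true_positive, num_ground_truth):
    """Non-interpolated average precision for one class.

    is_true_positive: booleans for detections that are ALREADY sorted by
        descending confidence (assumed to be matched to ground truth upstream).
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    tp = fp = 0
    ap = 0.0
    for flag in is_true_positive:
        if flag:
            tp += 1
            # Each true positive raises recall by 1/num_ground_truth;
            # weight that recall step by the precision at this rank.
            ap += (tp / (tp + fp)) / num_ground_truth
        else:
            fp += 1
    return ap


def mean_average_precision(per_class):
    """Mean of per-class AP. per_class maps a (hypothetical) class name
    to a (is_true_positive, num_ground_truth) pair."""
    if not per_class:
        return 0.0
    aps = [average_precision(flags, n) for flags, n in per_class.values()]
    return sum(aps) / len(aps)
```

For instance, `precision_recall(8, 2, 2)` returns `(0.8, 0.8)`: eight of ten detections are correct, and eight of ten ground-truth objects are found. Benchmarks differ in the IoU threshold and in whether the precision-recall curve is interpolated, so values computed this way are only comparable to published ones when those settings match.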

Highlights

In recent years, there has been significant progress in the development of aerial image detection [1,2,3].
Because GANsformer inherits and combines the structural and global feature extraction advantages of convolutional neural networks (CNNs) and visual transformers, its performance is significantly better than that of CNNs and ViT with comparable parameter complexity.
The transformer serves as a branch network that improves the CNN's ability to capture global features, while the number of parameters is reduced and training speed is improved. With comparable parameter complexity, GANsformer shows great potential for aerial image detection tasks.

Summary

Introduction

There has been significant progress in the development of aerial image detection [1,2,3]. Traditional target detection typically relies on images acquired on the ground, and the limitations of such datasets constrain the scope of target detection research. Some images are objectively difficult to acquire, such as those taken in extreme geographical locations or those containing large objects. Aerial image detection, an increasingly prevalent approach in object detection studies, mitigates these issues, broadens the research scope of object detection, and makes access to images more flexible and convenient. Advances in image detection algorithms and in methods for capturing aerial images have provided strong support for improved aerial image object detection. Aerial image object detection technologies will become increasingly important in the future.

Similar Papers

Deep learning based multi-category object detection in aerial images
Tobias Schuchert ... Abhijit Mahalanobis
01 May 2017

MSODANet: A network for multi-scale object detection in aerial images using hierarchical dilated convolutions
Vishnu Chalavadi ... Krishna Mohan C
Pattern Recognition | VOL. 126
23 Jan 2022

Training lightweight network from scratch for efficient object detection in aerial images
Ang Su ... Jon Atli Benediktsson
07 Oct 2019

Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges.
Jian Ding ... Marcello Pelillo
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 44
01 Nov 2022