A variety of deep learning approaches have been applied to region of interest (ROI) extraction, a fundamental task in remote sensing image (RSI) processing. However, the unbalanced distribution of positive and negative samples in most RSIs greatly restricts the performance of these deep learning-based methods. In this study, a data augmentation method based on a variational autoencoder-multiscale generative adversarial network (VAE-MSGAN) with spatial and channelwise attention (SCA) is proposed to balance the sample distribution and improve the subsequent ROI extraction results. First, we combine the original multispectral information with handcrafted texture features to make full use of the low-level visual features of RSIs. We then design a VAE-MSGAN to generate realistic RSIs with high quality and diversity. Specifically, SCA blocks are introduced in the generator to adaptively recalibrate the importance of different channels and spatial regions. We also build a multiscale discriminator architecture to improve the visual quality of the generated samples. Finally, we compare the ROI extraction results before and after the augmentation. Our experimental results demonstrate that the proposed method not only improves the performance of ROI extraction but also outperforms other classical generative methods.
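To make the idea of spatial and channelwise recalibration concrete, the following is a minimal, parameter-free sketch in plain Python. It is not the SCA block used in the study, whose exact design is not given in the abstract; the function name `sca_block`, the use of average pooling, and the sigmoid gates are all illustrative assumptions. A trained block would normally learn its gating weights rather than derive them directly from pooled activations.

```python
import math


def sigmoid(x):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sca_block(features):
    """Rescale a feature map by a channel gate and a spatial gate.

    `features` is a list of C channels, each an H x W list of floats.
    Hypothetical, parameter-free illustration: real attention blocks
    use learned layers to produce the gates.
    """
    n_channels = len(features)
    height, width = len(features[0]), len(features[0][0])

    # Channel gate: global average pooling per channel, then sigmoid.
    channel_gate = [
        sigmoid(sum(sum(row) for row in channel) / (height * width))
        for channel in features
    ]

    # Spatial gate: mean across channels at each pixel, then sigmoid.
    spatial_gate = [
        [
            sigmoid(sum(features[c][i][j] for c in range(n_channels)) / n_channels)
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Recalibrate: scale every value by its channel gate and its spatial gate.
    return [
        [
            [
                features[c][i][j] * channel_gate[c] * spatial_gate[i][j]
                for j in range(width)
            ]
            for i in range(height)
        ]
        for c in range(n_channels)
    ]


# Toy input: two channels of size 2 x 2 (e.g. one spectral band, one texture feature).
toy = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
recalibrated = sca_block(toy)
```

In this toy input the second channel has the larger mean activation, so it receives the larger channel gate, while the spatial gate is uniform because every pixel has the same cross-channel mean.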