Abstract

Remote sensing scene images are characterized by high intra-class diversity and strong inter-class similarity. Traditional deep-learning classification algorithms extract only the global features of a scene image and ignore the important role of local key features, which limits the expressive power of the learned features and restricts improvements in classification accuracy. This paper therefore presents a multi-channel attention fusion network (MCAFNet). First, three channels are used to extract image features: a channel-spatial attention module is added after the max-pooling layer of two channels to capture both global and local key features, while the third channel uses the original model to extract deep features. Second, the features extracted from the different channels are effectively combined by a fusion module. Finally, an adaptive weight loss function is designed to automatically adjust the weights of the different loss terms. Three challenging datasets, the UC Merced Land-Use Dataset (UCM), the Aerial Image Dataset (AID), and the Northwestern Polytechnical University Dataset (NWPU), are selected for the experiments. Experimental results show that our algorithm can effectively recognize scenes and obtains competitive classification results.
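The abstract does not give implementation details, but the described pipeline (two attention-gated branches, one plain deep-feature branch, a fusion step, and an adaptively weighted loss) can be sketched roughly as follows. This is a minimal numpy sketch under several assumptions: the channel-spatial attention is assumed to be CBAM-style (channel gating followed by spatial gating, with the spatial convolution omitted for brevity), the fusion module is stood in for by simple averaging, and the adaptive loss is assumed to be an uncertainty-style weighting; none of these specifics are confirmed by the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    # x: (C, H, W). Global average- and max-pooled channel descriptors
    # pass through a shared two-layer MLP (weights w1, w2), then a sigmoid
    # produces a per-channel gate (CBAM-style assumption).
    avg = x.mean(axis=(1, 2))                               # (C,)
    mx = x.max(axis=(1, 2))                                 # (C,)
    gate = sigmoid(w2 @ np.tanh(w1 @ avg) + w2 @ np.tanh(w1 @ mx))
    return x * gate[:, None, None]

def spatial_attention(x):
    # Per-pixel average and max across channels, combined and squashed
    # into a spatial gate (the usual 7x7 conv is omitted for brevity).
    avg = x.mean(axis=0)                                    # (H, W)
    mx = x.max(axis=0)                                      # (H, W)
    gate = sigmoid(avg + mx)
    return x * gate[None, :, :]

def mcaf_forward(x, w1, w2):
    # Hypothetical three-branch forward pass: two attention branches plus
    # one plain deep-feature branch, fused by averaging as a stand-in for
    # the paper's fusion module.
    b1 = spatial_attention(channel_attention(x, w1, w2))
    b2 = spatial_attention(channel_attention(x, w1, w2))
    b3 = x                                                  # original-model branch
    return (b1 + b2 + b3) / 3.0

def adaptive_weighted_loss(losses, log_vars):
    # Hypothetical adaptive weighting: each loss term is scaled by a
    # learnable factor exp(-s) with a +s regularizer; the abstract does
    # not specify the actual weighting scheme.
    return sum(np.exp(-s) * l + s for l, s in zip(losses, log_vars))
```

In this sketch the two attention branches share weights only for compactness; in a real network each branch would carry its own backbone, and the fusion module would typically concatenate or learn a weighted combination of branch features rather than average them.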
