Abstract

With advances in deep learning, many recent CNN-based methods have yielded promising results for image classification. In very high-resolution (VHR) remote sensing images, the contributions of different regions to image classification can vary significantly, because informative areas are generally limited and scattered throughout the whole image. Two main challenges therefore need to be addressed: how to pay more attention to these informative areas, and how to better incorporate them over long distances. In this article, we propose a gated recurrent multiattention neural network (GRMA-Net) to address these problems. Because informative features generally occur at multiple stages in a network (i.e., local texture features at shallow layers and global profile features at deep layers), we use multilevel attention modules to focus on informative regions and extract more discriminative features. These features are then arranged as spatial sequences and fed into a deep gated recurrent unit (GRU) to capture long-range dependencies and contextual relationships. We evaluate our method on the UC Merced (UCM), Aerial Image dataset (AID), NWPU-RESISC (NWPU), and Optimal-31 (Optimal) datasets. Experimental results demonstrate the superior performance of our method compared with other state-of-the-art methods.
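
The pipeline described in the abstract (attend to informative regions, arrange the attended features as a spatial sequence, aggregate them with a GRU) can be illustrated with a minimal sketch. This is not the authors' GRMA-Net: it uses a single attention level instead of multilevel modules, a plain dot-product softmax attention swapped in for illustration, random toy features instead of CNN feature maps, and one bias-free GRU layer in one common update convention. All names (`attend`, `gru_step`, `encode`, `query`, `params`) and all sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(regions, query):
    """Reweight region features by softmax attention scores.

    The score is a dot product with a hypothetical query vector; this is
    an illustrative stand-in, not the attention module of the paper.
    """
    scores = [sum(f * q for f, q in zip(region, query)) for region in regions]
    weights = softmax(scores)
    attended = [[w * f for f in region] for w, region in zip(weights, regions)]
    return attended, weights


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One GRU update (biases omitted): h' = (1 - z) * h + z * candidate."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated_h))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode(regions, query, p, hidden_size):
    """Attend to regions, then run the GRU over them in sequence order."""
    attended, weights = attend(regions, query)
    h = [0.0] * hidden_size
    for x in attended:  # regions are assumed to be flattened row by row
        h = gru_step(x, h, p)
    return h, weights


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
D, H = 4, 3  # toy feature size and hidden size (assumptions)
# Six region features, e.g. a 2 x 3 feature grid flattened row by row.
regions = [[rng.uniform(-1.0, 1.0) for _ in range(D)] for _ in range(6)]
query = [rng.uniform(-1.0, 1.0) for _ in range(D)]
params = {
    name: random_matrix(H, D if name.startswith("W") else H, rng)
    for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")
}

summary, weights = encode(regions, query, params, H)
print("attention weights:", [round(w, 3) for w in weights])
print("sequence summary:", [round(v, 3) for v in summary])
```

The final hidden state `summary` depends on every region in the sequence, which is the sense in which a recurrent unit can relate informative areas that lie far apart in the image. In a trained network the attention weights and GRU parameters would be learned rather than random.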

Highlights

  • With the development of satellite imaging sensors, very high-resolution (VHR) satellite images have become available for remote sensing (RS) scene classification [1]–[3] and promoted the development of geospatial object detection [4], [5], land cover/land use classification [6], [7], and natural hazard detection [8]

  • It is worth noting that the improvements in OA scores achieved by our GRMA-Net on the Aerial Image dataset (AID) and the NWPU dataset are significant

  • This is because the spatial resolution of the AID and NWPU datasets varies significantly

Summary

Introduction

With the development of satellite imaging sensors, very high-resolution (VHR) satellite images have become available for remote sensing (RS) scene classification [1]–[3] and promoted the development of geospatial object detection [4], [5], land cover/land use classification [6], [7], and natural hazard detection [8]. Diverse semantic categories, complex spatial information, and high intraclass and low interclass variations in VHR RS images introduce great challenges to accurate classification. RS images generally have complex spatial structures. They usually cover a large-scale area with many types of objects. The irrelevant areas cannot be well suppressed. This problem leads to misclassification by the network. Because of the long imaging distance, informative areas are generally scattered across the whole image and exhibit a complex spatial distribution. How to effectively aggregate these widely distributed features is the other problem to be solved.

Manuscript received January 4, 2021; revised May 25, 2021; accepted June 27, 2021.
