Abstract
Multi-label classification plays an important role in understanding the complex content of an aerial image and has motivated several related studies in recent years. However, most of them make little effort to exploit label relations, even though such dependencies are crucial for accurate predictions. Although an LSTM layer can be introduced to model label dependencies in a chain-propagation manner, its effectiveness is questionable when certain labels are improperly inferred. To address this, we propose a novel aerial image multi-label classification network, the attention-aware label relational reasoning network. Our network consists of three elemental modules: 1) a label-wise feature parcel learning module, 2) an attentional region extraction module, and 3) a label relational inference module. Specifically, the label-wise feature parcel learning module extracts high-level label-specific features. The attentional region extraction module localizes discriminative regions in these features and yields attentional label-specific features. The label relational inference module finally predicts label presence using label relations reasoned from the outputs of the previous module. The proposed network is characterized by its capacity to extract discriminative label-wise features in a proposal-free way and to reason about label relations naturally and interpretably. In our experiments, we evaluate the proposed model on the UCM multi-label dataset and a newly produced dataset, the AID multi-label dataset. Quantitative and qualitative results on these two datasets demonstrate the effectiveness of our model. To facilitate progress in multi-label aerial image classification, the AID multi-label dataset will be made publicly available.
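The three modules above can be sketched as a small pipeline: per-label feature parcels are pooled by spatial attention into label-specific vectors, and label scores are then computed from relation-weighted combinations of those vectors. The sketch below is a minimal numpy illustration under assumed shapes and a simple dot-product relation operator; it is not the authors' exact architecture, and random arrays stand in for a CNN backbone.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

L, D, H, W = 4, 8, 6, 6  # labels, channels, spatial size (all assumed)

# 1) Label-wise feature parcel learning: one D x H x W feature parcel
#    per label (random values stand in for backbone features here).
parcels = rng.standard_normal((L, D, H, W))

# 2) Attentional region extraction: a spatial attention map per label,
#    used to pool each parcel into an attentional label-specific vector.
attn_logits = parcels.mean(axis=1).reshape(L, H * W)
attn = softmax(attn_logits, axis=1).reshape(L, 1, H, W)  # sums to 1 per label
label_feats = (parcels * attn).sum(axis=(2, 3))          # (L, D)

# 3) Label relational inference: score each label from its own feature
#    plus a relation-weighted sum over the other labels' features.
relation = softmax(label_feats @ label_feats.T, axis=1)  # (L, L) label affinities
context = relation @ label_feats                         # relation-reasoned features
w = rng.standard_normal(D)                               # assumed linear classifier
scores = sigmoid((label_feats + context) @ w)            # per-label probabilities

print(scores.shape)  # (4,)
```

The dot-product affinity used in step 3 is only one possible relation operator; the point is that each label's score depends on the attentional features of all labels, not just its own.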
Highlights
Recent advancements in remote sensing techniques have boosted the volume of attainable high-resolution aerial images, and numerous applications, such as urban cartography [1]–[4], traffic monitoring [5]–[7], terrain surface analysis [8]–[11], and ecological monitoring [12], [13], have benefited from these developments.
1) We propose a novel multi-label aerial image classification network, the attention-aware label relational reasoning convolutional neural network (CNN), which consists of three essential components: a label-wise feature parcel learning module, an attentional region extraction module, and a label relational inference module.
2) The extraction of high-level features is crucial for visual recognition tasks, and many recent studies adopt CNNs owing to their remarkable performance in learning such features [15], [51]–[56].
Summary
The classification of aerial images refers to assigning these images specific labels according to their semantic contents, and a common hypothesis shared by many relevant studies is that an image should be labeled with only one semantic category, such as a scene category (see Fig. 1). The acquisition of the requisite ground truths (i.e., pixel-wise annotations and bounding boxes) demands enormous expertise and human labor, which makes the relevant data sets expensive and difficult to access. With this in mind, multi-label image classification has attracted increasing attention in the remote sensing community [30]–[34], because 1) it draws a comprehensive picture of aerial image contents and 2) the data sets required for this task are inexpensive (only image-level labels are needed).
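The contrast drawn above between single-label scene classification and multi-label classification can be made concrete with a minimal sketch: a softmax head commits to exactly one category, while independent per-label sigmoids let any subset of labels be present. The label names and logit values below are purely illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical per-label logits for one aerial image (illustrative only).
labels = ["building", "car", "pavement", "tree"]
logits = np.array([2.1, -0.4, 1.3, -2.0])

# Single-label hypothesis: pick exactly one category via argmax/softmax.
single = labels[int(np.argmax(logits))]

# Multi-label setting: an independent sigmoid per label, thresholded at
# 0.5, so several labels can be assigned to the same image at once.
probs = sigmoid(logits)
multi = [l for l, p in zip(labels, probs) if p > 0.5]

print(single)  # building
print(multi)   # ['building', 'pavement']
```

Only the image-level presence or absence of each label is needed to supervise the multi-label head, which is why such annotations are far cheaper than pixel-wise masks or bounding boxes.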
Published in: IEEE Transactions on Geoscience and Remote Sensing