Abstract

Conventional remote sensing image retrieval (RSIR) systems usually perform single-label retrieval, where each image is annotated with a single label representing its most significant semantic content. This assumption, however, ignores the complexity of remote sensing images, where an image might contain multiple classes (i.e., multiple labels), and thus results in worse retrieval performance. We therefore propose a novel multi-label RSIR approach based on fully convolutional networks (FCN). In our approach, we first train an FCN model using a pixel-wise labeled dataset, and the trained FCN is then used to predict the segmentation map of each image in the considered archive. We finally extract region convolutional features of each image based on its segmentation map. The region features can either be used to perform region-based retrieval or be further post-processed to obtain a feature vector for similarity measurement. The experimental results show that our approach achieves state-of-the-art performance compared with conventional single-label and recent multi-label RSIR approaches.
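
The region feature step described above can be illustrated with a minimal sketch. The abstract does not say how features are pooled within a region, so the per-region averaging below is an assumption, and the function name region_features and the toy inputs are hypothetical. It assumes the convolutional feature map has already been resized to the resolution of the segmentation map.

```python
def region_features(feature_map, seg_map):
    """Average the feature vectors of all pixels that share a label.

    feature_map: H x W x C nested lists of floats (hypothetical conv features,
                 assumed to match the segmentation map's resolution).
    seg_map:     H x W nested lists of integer class labels.
    Returns {label: C-dimensional mean feature vector}.
    Average pooling is an assumption, not the paper's stated method.
    """
    sums = {}
    counts = {}
    for feat_row, label_row in zip(feature_map, seg_map):
        for feat, label in zip(feat_row, label_row):
            if label not in sums:
                sums[label] = [0.0] * len(feat)
                counts[label] = 0
            # Accumulate the per-channel sum for this region.
            sums[label] = [s + f for s, f in zip(sums[label], feat)]
            counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy 2 x 2 image with 2-channel features and two classes (0 and 1).
feature_map = [[[1.0, 0.0], [3.0, 0.0]],
               [[0.0, 2.0], [0.0, 4.0]]]
seg_map = [[0, 0],
           [1, 1]]
regions = region_features(feature_map, seg_map)
```

In this sketch the keys of the returned dictionary also serve as the image's multiple labels, and each value is one region feature that could be matched directly or combined into a single vector.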

Highlights

  • Recent advances in satellite technology have resulted in a considerable volume of remote sensing (RS) image archives

  • A remote sensing image retrieval (RSIR) system generally has two main parts: 1) feature extraction, in which the images are described and represented by a set of image features, and 2) similarity measure, in which the query image is matched against the remaining images in the archive to retrieve the most similar ones. The remote sensing community has focused mainly on developing discriminative image features, because retrieval performance depends greatly on the effectiveness of the extracted features

  • Single labels are sufficient for RS problems with simple image classes, such as distinguishing between a building and a river, but multiple labels are required for distinguishing more complex image categories, such as dense residential and medium residential, where the differences only lie in the density of the buildings
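
The two-part structure described in the highlights (feature extraction, then similarity measure) can be sketched as follows. This is only an illustration: the Euclidean distance, the function names euclidean and retrieve, and the toy feature vectors are assumptions, not details taken from the paper. It assumes every image has already been reduced to a fixed-length feature vector.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_feature, archive, top_k=3):
    """Return the names of the top_k archive images closest to the query.

    archive: {image_name: feature_vector} (hypothetical precomputed features).
    """
    ranked = sorted(
        archive.items(),
        key=lambda item: euclidean(query_feature, item[1]),
    )
    return [name for name, _ in ranked[:top_k]]


# Toy archive of three images with 2-dimensional features.
archive = {
    "img_a": [0.9, 0.1],
    "img_b": [0.1, 0.9],
    "img_c": [0.8, 0.3],
}
result = retrieve([1.0, 0.0], archive, top_k=2)
```

Because the ranking depends only on distances between feature vectors, the quality of the extracted features largely determines the quality of the retrieved list, which is consistent with the emphasis on feature design noted above.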

Summary

Introduction

Recent advances in satellite technology have resulted in a considerable volume of remote sensing (RS) image archives. In [6], scale-invariant feature transform (SIFT) descriptors are extracted and aggregated by BoVW to generate compact features for RSIR. Although the RSIR methods mentioned above can achieve reliable performance, they are essentially single-label approaches. In [7], an image scene semantic matching scheme is proposed for multi-label RSIR, in which an object-based support vector machine (SVM) classifier is used to obtain classification maps of the images in the archive. In another work [8], image visual, object, and semantic features are combined to perform a coarse-to-fine retrieval of RS images from multiple sensors. In a recent work [10], a semi-supervised graph-theoretic method is introduced for multi-label RSIR, in which only a small number of images are manually labeled for training. These multi-label RSIR methods generally achieve better performance than single-label ones, since multiple labels can provide extra semantic information. In our work, we use the FCN package based on MatConvNet [12] to build and train our FCN.

Image Segmentation by FCN
Region Convolutional Feature Extraction
Experiments and Analysis
Findings
Conclusion
