Abstract
Image classification is one of the main drivers of the rapid developments in deep learning with convolutional neural networks for computer vision. The same holds for the analogous task of scene classification in remote sensing. However, in contrast to the computer vision community, which has long used well-established, large-scale standard datasets to train and benchmark high-capacity models, the remote sensing community still largely relies on relatively small and often application-dependent datasets and thus lacks comparability. In this paper, we present a classification-oriented conversion of the SEN12MS dataset. Using it, we provide results for several baseline models based on two standard CNN architectures and different input data configurations. Our results support the benchmarking of remote sensing image classification and provide insights into the benefit of multi-spectral data and multi-sensor data fusion over conventional RGB imagery.
Highlights
One of the most crucial preconditions for the development of machine learning models for the interpretation of remote sensing data is the availability of annotated datasets
We present the conversion of the SEN12MS dataset for image classification purposes, as well as several baseline models including their evaluation
It is interesting to note that for both convolutional neural network (CNN) architectures, data fusion provides the best result for this class, while multi-spectral imagery yields the worst result, even worse than RGB only
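The three kinds of input compared above (RGB only, multi-spectral, and multi-sensor data fusion) can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: each Sentinel-2 patch is taken to be a channels-first array with 13 bands ordered B1, B2, B3, B4, and so on, and each Sentinel-1 patch is taken to have 2 polarisation channels. The function name `build_input`, the index list `S2_RGB_IDX`, and the patch size are hypothetical and are not claimed to match the configurations used for the baselines.

```python
import numpy as np

# Assumption: Sentinel-2 bands are ordered B1, B2, B3, B4, ... so that
# B4 (red), B3 (green), B2 (blue) sit at zero-based indices 3, 2, 1.
S2_RGB_IDX = [3, 2, 1]


def build_input(s1, s2, config):
    """Return a channels-first input array for a hypothetical configuration.

    s1: Sentinel-1 patch, shape (2, H, W)   (assumed layout)
    s2: Sentinel-2 patch, shape (13, H, W)  (assumed layout)
    """
    if config == "rgb":
        # Keep only the three visible bands of the optical image.
        return s2[S2_RGB_IDX]
    if config == "multispectral":
        # Use all optical bands.
        return s2
    if config == "fusion":
        # Stack optical and radar channels along the channel axis.
        return np.concatenate([s2, s1], axis=0)
    raise ValueError(f"unknown config: {config}")


# Dummy patches, for shape checking only.
s1 = np.zeros((2, 256, 256), dtype=np.float32)
s2 = np.ones((13, 256, 256), dtype=np.float32)

for name in ("rgb", "multispectral", "fusion"):
    print(name, build_input(s1, s2, name).shape)
```

In this sketch fusion is plain channel concatenation, chosen because it lets the same CNN be reused by changing only the number of input channels of its first layer. Other fusion schemes are possible and are not covered here.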
Summary
One of the most crucial preconditions for the development of machine learning models for the interpretation of remote sensing data is the availability of annotated datasets. The great success of deep learning was largely driven by the desire to solve the image classification problem, i.e., assigning one or more labels to a given photograph. For this purpose, many researchers have relied on the ImageNet database (Deng et al., 2009), which contains millions of annotated images. As can be seen from this non-exhaustive selection, most datasets built for remote sensing image classification deal with high-resolution aerial imagery, usually providing three or four spectral channels (RGB, or RGB plus near-infrared). Sensing-specific models that can later be fine-tuned to individual problems and user needs
More From: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences