Over the past decade, many data-sharing efforts have enriched research on remote sensing scene classification (SC), and the resulting datasets have supported considerable progress in interpreting complex high-level semantic information. However, most existing datasets consist of standard, ungeoreferenced image patches collected for algorithm training and evaluation. Such datasets are poorly suited to practical applications and cannot be used directly in further geographical study. Accordingly, we provide a large-area, high-resolution SC dataset with multiple time phases, called the “Wuhan Multiapplication VHR Scene classification dataset” (WH-MAVS). It facilitates the study of SC and scene change detection (SCD) algorithms and can also be used directly for a variety of real-life land use application tasks. To the best of our knowledge, this is the first free, publicly available, georeferenced, and annotated dataset to cover almost an entire megacity. WH-MAVS was collected and annotated from Google Earth imagery with a single spatial resolution and a uniform, nonoverlapping patch size, covering the central area of Wuhan, China. The total number of scene samples is 47 137, which belong to 14 classes, with 23 567 labeled patches for each of the two time phases, 2014 and 2016.
The geographic coordinates of all samples in the two time phases correspond one-to-one, with 23 202 image patches whose scene category is unchanged and 365 whose category changed. The number of samples per class is highly imbalanced; moreover, intraclass differences are large and interclass differences are hard to distinguish. These characteristics are closer to real land use/land cover application tasks and pose further challenges for related algorithm research. In addition, we conducted benchmark experiments on SC and SCD on the WH-MAVS dataset with widely used deep learning models. DenseNet169 achieved the best performance, with overall accuracies of 91.07% and 92.09% on the 2014 and 2016 validation sets of WH-MAVS, respectively. Furthermore, SCD with DenseNet169 attains a binary change detection accuracy of 89.56% and a multiple (from–to) change detection accuracy of 86.70%. Beyond its value for algorithm research, the dataset is also shown to have practical applications in fields such as urban planning, landscape pattern analysis, and urban dynamic monitoring and analysis.
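Because the patches of the two time phases correspond one-to-one, binary and from–to change accuracies can be illustrated with a simple post-classification comparison. The following is a minimal sketch under stated assumptions, not the evaluation protocol of this work: the function name `scene_change_accuracy`, the class names, and the toy labels are all hypothetical, and the abstract does not specify how the reported SCD accuracies were computed.

```python
def scene_change_accuracy(pred_t1, pred_t2, true_t1, true_t2):
    """Post-classification comparison over co-located patches.

    Each argument is a sequence of class labels, one per patch, aligned so
    that index i refers to the same geographic location in both time phases.
    Returns (binary_accuracy, from_to_accuracy).
    """
    n = len(pred_t1)
    if n == 0 or not (len(pred_t2) == len(true_t1) == len(true_t2) == n):
        raise ValueError("label sequences must be non-empty and equal in length")

    binary_hits = 0
    from_to_hits = 0
    for p1, p2, t1, t2 in zip(pred_t1, pred_t2, true_t1, true_t2):
        # Binary change: only whether the class differs between the phases.
        if (p1 != p2) == (t1 != t2):
            binary_hits += 1
        # From-to change: the full (class at t1, class at t2) pair must match.
        if (p1, p2) == (t1, t2):
            from_to_hits += 1
    return binary_hits / n, from_to_hits / n


# Toy labels for four co-located patches (hypothetical class names).
true_2014 = ["residential", "farmland", "water", "industrial"]
true_2016 = ["residential", "residential", "water", "industrial"]
pred_2014 = ["residential", "farmland", "water", "commercial"]
pred_2016 = ["residential", "residential", "water", "commercial"]

binary_acc, from_to_acc = scene_change_accuracy(
    pred_2014, pred_2016, true_2014, true_2016
)
```

In this toy case the fourth patch is misclassified in both phases but still predicted as unchanged, so it counts as correct for binary change and incorrect for from–to change (1.0 versus 0.75). Under this definition the from–to accuracy can never exceed the binary accuracy, which matches the ordering of the two figures reported above.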