Abstract
Convolutional neural network (CNN)-based deep learning (DL) is a powerful, recently developed image classification approach. With origins in the computer vision and image processing communities, the accuracy assessment methods developed for CNN-based DL use a wide range of metrics that may be unfamiliar to the remote sensing (RS) community. To explore the differences between traditional RS and DL RS methods, we surveyed a random selection of 100 papers from the RS DL literature. The results show that RS DL studies have largely abandoned traditional RS accuracy assessment terminology, though some of the accuracy measures typically used in DL papers, most notably precision and recall, have direct equivalents in traditional RS terminology. Some of the DL accuracy terms have multiple names, or are equivalent to another measure. In our sample, DL studies only rarely reported a complete confusion matrix, and when they did so, it was even more rare that the confusion matrix estimated population properties. On the other hand, some DL studies are increasingly paying attention to the role of class prevalence in designing accuracy assessment approaches. DL studies that evaluate the decision boundary threshold over a range of values tend to use the precision-recall (P-R) curve, the associated area under the curve (AUC) measures of average precision (AP) and mean average precision (mAP), rather than the traditional receiver operating characteristic (ROC) curve and its AUC. DL studies are also notable for testing the generalization of their models on entirely new datasets, including data from new areas, new acquisition times, or even new sensors.
Highlights
The importance of assessment of the accuracy of remote sensing (RS) thematic classification has been recognized since the early days of remote sensing [1,2,3,4,5,6,7,8,9,10,11]
We reviewed 100 randomly-selected papers focusing on deep learning (DL) classification that were published in eight major RS journals in 2020
The review of these papers confirms that the RS DL community have largely abandoned traditional RS accuracy assessment terminology
Summary
The importance of assessment of the accuracy of remote sensing (RS) thematic classification has been recognized since the early days of remote sensing [1,2,3,4,5,6,7,8,9,10,11]. There is a general consensus regarding the importance of unbiased, randomized sampling to support the generation of summary accuracy data, normally presented in the form of a table called the confusion matrix, or error matrix. This table forms the basis for calculating summary metrics, most commonly the overall accuracy (OA), the Kappa statistic (though the use of this statistic has been challenged [12,13]), and the class-specific statistics of user’s (UA) and producer’s accuracy (PA). Note that that spatial spatial context, context, spectral information, edge, and textural information is highlighted by different spectral information, edge, and textural information is highlighted by different filters filters and and different ofof data abstraction This modeling of different convolutional convolutionallayers, layers,allowing allowinga ahigh highdegree degree data abstraction.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.