Abstract

Studies have demonstrated the robust performance of the ensemble machine learning classifier, random forests, for remote sensing land cover classification, particularly across complex landscapes. This study introduces new ensemble margin criteria to evaluate the performance of Random Forests (RF) in the context of large area land cover classification and examines the effect of different training data characteristics (imbalance and mislabelling) on classification accuracy and uncertainty. The study presents a new margin weighted confusion matrix, which used in combination with the traditional confusion matrix, provides confidence estimates associated with correctly and misclassified instances in the RF classification model. Landsat TM satellite imagery, topographic and climate ancillary data are used to build binary (forest/non-forest) and multiclass (forest canopy cover classes) classification models, trained using sample aerial photograph maps, across Victoria, Australia. Experiments were undertaken to reveal insights into the behaviour of RF over large and complex data, in which training data are not evenly distributed among classes (imbalance) and contain systematically mislabelled instances. Results of experiments reveal that while the error rate of the RF classifier is relatively insensitive to mislabelled training data (in the multiclass experiment, overall 78.3% Kappa with no mislabelled instances to 70.1% with 25% mislabelling in each class), the level of associated confidence falls at a faster rate than overall accuracy with increasing amounts of mislabelled training data. In general, balanced training data resulted in the lowest overall error rates for classification experiments (82.3% and 78.3% for the binary and multiclass experiments respectively). However, results of the study demonstrate that imbalance can be introduced to improve error rates of more difficult classes, without adversely affecting overall classification accuracy.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.