Abstract

Currently, most global land cover maps are produced with discrete classes, which express the dominant land cover class in each pixel, or a combination of several classes at a predetermined ratio. In contrast, land cover fraction mapping expresses the proportion of each pure class in each pixel, which increases precision and reduces legend complexity. To map land cover fractions, regression rather than classification algorithms are needed, and multiple approaches are available for this task.

A major challenge for land cover fraction mapping models is data sparsity. Land cover fraction data is by its nature zero-inflated, because the 0% fraction is so common. As regression favours the mean, 0% and 100% fractions are difficult for regression models to predict accurately. We proposed a new solution that combines three models: a binary model determines whether a pixel is pure; if so, the pixel is processed with a classification model; otherwise, it is processed with a regression model.

We compared multiple regression algorithms and implemented our proposed three-step model on the algorithm with the lowest RMSE. We further evaluated the spatial and per-class accuracy of the model and demonstrated a wall-to-wall prediction of seven land cover fractions over the globe. The models were trained on over 138,000 points and validated on a separate dataset of over 20,000 points, provided by the CGLS-LC100 project. Both datasets are global and aligned with the PROBA-V 100 m UTM grid.

Results showed that the random forest regression model reached the lowest RMSE of 17.3%. The lowest MAE (7.9%) and highest overall accuracy (72% ± 2%) were achieved using random forest with our proposed three-model approach and median vote.

This research demonstrates that machine learning algorithms can be applied globally to map a wide variety of land cover fractions. Fraction mapping expresses land cover more precisely and lets users create their own discrete maps using user-defined thresholds and rules, so the result can be customised for a diverse range of uses. The three-step approach is useful for addressing the zero-inflation issue and mapping 0% and 100% fractions more accurately, and has therefore already been taken up in the operational production of global land cover fraction layers within the CGLS-LC100 project. Furthermore, this study contributes to the accuracy assessment of land cover fraction maps both thematically and spatially, and these methods could be taken up by future land cover fraction mapping efforts.
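
The three-step routing described in the abstract can be sketched in a few lines of Python. This is a minimal illustration of the control flow only, not the authors' implementation: the function name `predict_three_step`, the three class names, the toy stand-in models, and the rescaling of regression output to sum to 100% are all assumptions made for the example.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]
Fractions = Dict[str, float]

# Hypothetical class names; the study maps seven fractions, not listed here.
CLASSES = ["tree", "grass", "water"]


def predict_three_step(
    features: Features,
    is_pure: Callable[[Features], bool],
    classify: Callable[[Features], str],
    regress: Callable[[Features], Fractions],
    classes: Sequence[str] = CLASSES,
) -> Fractions:
    """Route one pixel through binary -> classification or regression."""
    if is_pure(features):
        # Pure pixel: the classifier picks one class, which gets 100%.
        winner = classify(features)
        return {c: (100.0 if c == winner else 0.0) for c in classes}
    # Mixed pixel: use the regression output.
    raw = regress(features)
    # Assumption: clip negatives and rescale so the fractions sum to 100%.
    clipped = {c: max(0.0, raw.get(c, 0.0)) for c in classes}
    total = sum(clipped.values())
    if total == 0.0:
        return {c: 100.0 / len(classes) for c in classes}
    return {c: 100.0 * v / total for c, v in clipped.items()}


# Toy stand-ins for the three trained models (purely illustrative).
def toy_is_pure(f: Features) -> bool:
    return max(f) >= 0.9


def toy_classify(f: Features) -> str:
    return CLASSES[max(range(len(f)), key=lambda i: f[i])]


def toy_regress(f: Features) -> Fractions:
    return {c: 100.0 * v for c, v in zip(CLASSES, f)}


pure_pixel = predict_three_step([0.95, 0.03, 0.02], toy_is_pure, toy_classify, toy_regress)
mixed_pixel = predict_three_step([0.5, 0.3, 0.2], toy_is_pure, toy_classify, toy_regress)
```

Because pure pixels never pass through the regression model, they receive exact 0% and 100% values, which is the property the three-step design is meant to provide.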

Highlights

  • Land cover, as one of the key variables for monitoring a number of Sustainable Development Goals (SDGs), has lately received more attention due to increased availability of higher spatial and temporal resolution satellite data

  • Regression model comparison showed that random forest (RF) regression achieved the highest accuracy: by RMSE when using a single model and a mean vote (RMSE: 17.3%, MAE: 9.4%), and by MAE when using a median vote (RMSE: 20.7%, MAE: 7.9%)

  • We investigated ways to tackle the issue of accurately predicting the extreme fraction values of 0% and 100% by proposing a hierarchical multi-step approach combining classification and regression models
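
The difference between a mean vote and a median vote, and the two error metrics quoted above, can be illustrated with a short sketch. It assumes that "vote" refers to how the per-tree predictions of a random forest are aggregated into one value per pixel. The per-tree values and reference fractions below are made up for illustration and do not reproduce the reported figures.

```python
from math import sqrt
from statistics import mean, median


def rmse(pred, obs):
    """Root mean squared error, in the same units as the inputs (%)."""
    return sqrt(mean([(p - o) ** 2 for p, o in zip(pred, obs)]))


def mae(pred, obs):
    """Mean absolute error, in the same units as the inputs (%)."""
    return mean([abs(p - o) for p, o in zip(pred, obs)])


# Made-up per-tree predictions (%) of one class: one row per pixel.
tree_votes = [
    [0.0, 0.0, 0.0, 0.0, 40.0],          # most trees say 0%
    [100.0, 100.0, 100.0, 60.0, 100.0],  # most trees say 100%
    [20.0, 50.0, 60.0, 30.0, 40.0],      # mixed pixel
]
observed = [0.0, 100.0, 45.0]  # made-up reference fractions (%)

# Mean vote: a single outlier tree pulls the result away from 0% or 100%.
mean_vote = [mean(v) for v in tree_votes]
# Median vote: returns exactly 0% or 100% when most trees agree on it.
median_vote = [median(v) for v in tree_votes]

for name, pred in (("mean", mean_vote), ("median", median_vote)):
    print(f"{name} vote: RMSE={rmse(pred, observed):.2f}, MAE={mae(pred, observed):.2f}")
```

On zero-inflated data the median vote can return the extreme values exactly, which is consistent with it giving the lowest MAE in the comparison above. How the two votes rank on RMSE depends on the data.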


Summary

Introduction

Land cover, as one of the key variables for monitoring a number of Sustainable Development Goals (SDGs), has lately received more attention due to increased availability of higher spatial and temporal resolution satellite data. Except for the cover fraction layers of the CGLS-LC100 product, all global land cover products that include major land cover classes, such as the ones described by Bartholome and Belward (2005), Friedl et al. (2010), Arino et al. (2007), See et al. (2015), and Chen et al. (2015), are provided with discrete classes (known as “hard” or “crisp” classification), where each pixel of the map can only represent a single land cover class. Such discrete classification oversimplifies reality, as mixed pixels that are covered by multiple land cover classes are a common occurrence. These systematic errors add up when scaling the result to the entire region.
