Eucalyptus plantations are widespread in the highlands of northern Ethiopia. The species has been used for centuries for various purposes. However, it is controversial because of its reportedly excessive consumption of soil nutrients and water. Modelling the spatial distribution of the species is fundamental to understanding its ecological and hydrological effects in the region and to informing policy. Therefore, the purpose of this study is to develop a model for mapping the spatial distribution of Eucalyptus globulus. We used the spectral bands of Sentinel-2 data, vegetation indices, and environmental data as predictor variables, and three machine learning algorithms (Random Forest, Support Vector Machine, and Boosted Regression Trees), to model the current distribution of Eucalyptus globulus. Eleven of the twenty-five predictor variables were filtered using the variance inflation factor (VIF). A total of 419 in situ georeferenced data points were used to train and validate the models. The area under the curve (AUC), kappa statistic (K), true skill statistic (TSS), root mean squared error (RMSE), and coefficient of determination (R²) were used to evaluate the models' performance. By these metrics, Random Forest performed best. The Random Forest prediction map showed that Eucalyptus globulus was detected reasonably well among non-Eucalyptus globulus woody vegetation (R² = 0.86, P < 0.001; RMSE = 0.31). The Green Normalized Difference Vegetation Index and environmental variables, such as elevation and distance from the road, were the most important predictor variables in explaining the distribution of Eucalyptus globulus. Our findings demonstrate that machine learning algorithms using Sentinel-2 spectral bands and vegetation indices, combined with environmental data, can effectively model the spatial distribution of Eucalyptus globulus.
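For readers unfamiliar with two of the validation metrics named above, the following is a minimal sketch of how the kappa statistic and the true skill statistic (TSS) are commonly computed from a binary presence/absence confusion matrix. It is not the study's code: the function names and the toy labels are hypothetical, and it assumes model outputs have already been thresholded to 0/1 predictions.

```python
# Illustrative only: toy presence (1) / absence (0) labels, not data from the study.

def confusion_counts(observed, predicted):
    """Return (tp, fp, tn, fn) for binary presence/absence labels."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def true_skill_statistic(tp, fp, tn, fn):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + tn + fn
    observed_agreement = (tp + tn) / n
    chance_agreement = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


# Hypothetical validation points: 4 presences and 6 absences.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(observed, predicted)
tss = true_skill_statistic(*counts)
kappa = cohen_kappa(*counts)
print(f"TSS = {tss:.3f}, kappa = {kappa:.3f}")
```

Unlike kappa, TSS does not depend on prevalence (the share of presence points in the validation set), which is one reason the two are often reported together in species distribution modelling.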