In tropical rainforest regions with dense vegetation cover, effective large-scale soil mapping methods are needed to improve soil management and to replace time-consuming, labor-intensive conventional surveys. Although machine learning (ML) algorithms predict soil properties more accurately than linear models, their practical and automated application to remote sensing data requires further assessment. This study therefore integrates unmanned aerial vehicle (UAV)-based hyperspectral images and Light Detection and Ranging (LiDAR) point data to predict soil properties indirectly in two tropical rainforest mountains (Diaoluo and Limu) in Hainan Province, China. A total of 175 features, including texture features, vegetation indices, and forest parameters, were extracted from the two study sites. Six ML models were constructed: Partial Least Squares Regression (PLSR), Random Forest (RF), Adaptive Boosting (AdaBoost), Gradient Boosting Decision Trees (GBDT), Extreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP). They were used to predict four soil properties: soil acidity (pH), total nitrogen (TN), soil organic carbon (SOC), and total phosphorus (TP). To enhance model performance, a Bayesian optimization algorithm (BOA) was used to select model hyperparameters. Compared with default hyperparameters, BOA consistently improved model performance, with average R² improvements of 202.93%, 121.48%, 8.90%, and 38.41% for soil pH, SOC, TN, and TP, respectively. In general, BOA effectively captured the complex interactions between hyperparameters and predictive features, which improved the performance of the ML models relative to their default-parameter counterparts.
The GBDT model generally outperformed the other ML methods in predicting soil pH and TN, while the XGBoost model achieved the highest prediction accuracy for SOC and TP. Models that combined features derived from hyperspectral images and LiDAR data predicted soil properties better than models relying on either data source alone. In summary, this study highlights the promise of combining UAV-based hyperspectral images with LiDAR data to advance digital soil property mapping in forested areas and to support large-scale soil management and monitoring.
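
The percentage improvements in R² reported above can exceed 100% when the default-parameter baseline has a low R². The following minimal sketch illustrates the arithmetic. It assumes the improvement is the change in R² relative to the baseline R², a definition the abstract does not state. The function names and all numeric values are hypothetical and are not taken from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(r2_default, r2_tuned):
    """Percentage change in R2 relative to the baseline (assumed definition)."""
    return (r2_tuned - r2_default) / abs(r2_default) * 100.0


# Hypothetical soil pH observations and predictions (not data from the study).
observed = [4.0, 5.0, 6.0, 7.0]
pred_default = [5.0, 6.0, 5.0, 6.0]  # model with default hyperparameters
pred_tuned = [5.0, 5.0, 5.0, 7.0]    # same model after hyperparameter tuning

r2_default = r2_score(observed, pred_default)
r2_tuned = r2_score(observed, pred_tuned)
improvement = relative_improvement(r2_default, r2_tuned)

print(f"R2 default: {r2_default:.2f}")
print(f"R2 tuned:   {r2_tuned:.2f}")
print(f"Relative improvement: {improvement:.1f}%")
```

Under this assumed definition, a large percentage gain mainly signals a weak baseline, so the absolute R² values remain necessary for judging prediction quality.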