ABSTRACT Soil organic carbon (SOC) dataset augmentation, which enables comprehensive monitoring of carbon sinks at regional and global scales, is vital for global carbon cycle management and soil fertility. Building SOC maps from conventional laboratory or field measurements is time-consuming and costly, and is especially difficult in forests. A new approach that builds SOC maps accurately and efficiently, and that responds promptly to changes in SOC dynamics, is therefore needed. This study evaluated SOC estimation using a multiple linear regression (MLR) model and four machine learning algorithms, namely artificial neural networks (ANN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost), with satellite data and soil nutrient indicator data as inputs, in order to identify the optimal method. The results indicate that the SVM and XGBoost models had the best predictive ability (R² = 0.70 and 0.74, %RMSE = 8.8 and 8.3, MAE = 0.176 and 0.155) when using remote sensing variables and soil property variables, respectively. Band7_IDM, Band 5, Band4_IDM, RVI, NSMI, NDVI, Band 6, Band 7, and Band 4 were the most valuable variables in the SVM model, while NDVI, DVI, GVMI, NSMI, and Band4_Dive were the most valuable in the XGBoost model. The SVM model using remote sensing data may be applied to build SOC maps of Vietnamese forests with high accuracy (R² = 0.74) in place of conventional methods.