Abstract
A soil survey is the main source of soil spatial information and is very useful in supporting land use decisions, especially for agronomic purposes. However, the continuity of soil surveying in Brazil has been compromised due to minimal government investment in the sector. The goal of this study was to evaluate the performance of three data mining methods, decision trees (DT), random forest (RF), and artificial neural networks (ANN), using R software to predict soil map units in the Guapi-Macacu watershed in Rio de Janeiro State, Brazil. Terrain attributes derived from a digital elevation model (DEM), remote sensing data (Landsat 5), and a geology map were used as predictors. Model performance was evaluated with statistical indices obtained from a confusion matrix built from validation samples. Furthermore, the maps created from the models were compared with a conventional soil map (legacy data) to analyze their agreement; sampling locations were not used as input data in the training process. The results indicated better performance from the RF model than from the DT and ANN models. Accuracy assessed with the validation samples gave Kappa index values of 0.76 (RF), 0.72 (ANN), and 0.66 (DT). Comparison of the model maps with the conventional soil map showed the best agreement for the RF model (68.7%), followed by the ANN (62.8%) and DT (62.3%) models. The classification accuracy, obtained through comparisons with sampling locations, was 57.6% for the RF model, 56.8% for the DT model, and 56.1% for the ANN model; the conventional soil map had 69.7% correct classification. In general, most of the confusion between soil map units was observed in Haplic Gleysols, Haplic Acrisols (Clayic), and Fluvisols.
In terms of covariate importance, the terrain attributes derived from the DEM generally had more influence as predictors than the indices obtained from the Landsat data. The application of data mining can improve the methods used to predict soil map units, thereby increasing the information available on soil-landscape relationships and enhancing the products of soil surveys.
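As an illustration of the statistical indices mentioned above, the following minimal sketch computes overall accuracy and the Kappa index from a confusion matrix. It is written in Python rather than the R software used in the study, the function names are hypothetical, and the example counts are invented for illustration; they are not the study's data.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix has no samples")
    return sum(confusion[i][i] for i in range(len(confusion))) / total


def kappa_index(confusion):
    """Cohen's Kappa: agreement corrected for agreement expected by chance.

    Rows are reference classes and columns are predicted classes.
    """
    n = len(confusion)
    if any(len(row) != n for row in confusion):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in confusion)
    observed = overall_accuracy(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement: sum over classes of (row share * column share).
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Invented counts for three hypothetical soil map units (not study data).
example = [
    [40, 5, 5],
    [4, 30, 6],
    [6, 5, 29],
]
print(f"accuracy = {overall_accuracy(example):.3f}")
print(f"kappa    = {kappa_index(example):.3f}")
```

Kappa is lower than overall accuracy because it discounts the agreement expected by chance given the class frequencies, which is why it is commonly reported alongside the percentage of correct classification.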