Abstract

To address the problems of high-dimensional variables and characteristic wavelength selection in estimating soil organic matter content from hyperspectral data, a hybrid feature selection method combining random forest with a self-adaptive search method was proposed. In this hybrid method, random forest was first used to select the spectral variables of greatest importance in the modeling process as a preliminary optimal dataset. A wrapper approach combining a genetic algorithm with binary particle swarm optimization then served as the self-adaptive search algorithm, further searching the variables in the preliminary dataset. Random forest was chosen as the prediction model because of its strong robustness and its excellent performance with high-dimensional variables. Soil samples collected in a typical black soil region were the research object, and the data sources were the Vis-NIR spectra of the soil measured with an ASD spectrometer and the organic matter content determined by chemical analysis. Following reflectance transformation and spectral resampling, the proposed hybrid selection method was used to extract the characteristic spectral regions that served as input data for random forest. The prediction accuracy was compared with that of random forest models built on spectral datasets obtained with no selection, with random forest selection only, and with self-adaptive search only. The results showed that the random forest model using the characteristic wavelengths extracted by the proposed method achieved the highest prediction accuracy, with R², RMSE and RPD of 0.838, 0.54% and 2.534, respectively. Moreover, the proposed method selected features more efficiently than the other approaches.
It can be concluded that the hybrid feature selection method and the random forest algorithm can be applied effectively to estimating black soil organic matter content from hyperspectral data, and that they also provide a reference for variable selection and modeling when estimating organic matter content in other soil types.
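
The three accuracy measures reported above can be illustrated with a minimal Python sketch. The abstract does not define them, so the definitions here are assumptions based on common usage: R² as one minus the ratio of residual to total sum of squares, RMSE as the root mean square error, and RPD as the sample standard deviation of the observed values divided by RMSE. The function name `regression_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, RPD) for paired observed and predicted values.

    Assumed definitions (not stated in the abstract):
      R^2  = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = sample standard deviation of observed / RMSE
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two samples are required")

    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(observed))
    # A perfect prediction has zero RMSE, so RPD is reported as infinite.
    rpd = stdev(observed) / rmse if rmse > 0 else math.inf
    return r2, rmse, rpd


if __name__ == "__main__":
    # Hypothetical organic matter contents (%), for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.5, 2.5, 3.5]
    r2, rmse, rpd = regression_metrics(observed, predicted)
    print(f"R2={r2:.3f}  RMSE={rmse:.3f}  RPD={rpd:.3f}")
```

Because RMSE carries the unit of the target variable, an RMSE of 0.54% is read here as 0.54 percentage points of organic matter content. Some authors compute RPD with the standard error of prediction instead of RMSE, so values may differ slightly between conventions.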
