Machine learning (ML) has been applied extensively to geohazard modeling with considerable success. However, researchers and practitioners still face challenges in making ML models reliable. This study proposes a systematic framework that combines k-fold cross-validation (CV), metaheuristics (MHs), support vector regression (SVR), and the Friedman and Nemenyi tests to improve the reliability and performance of geohazard modeling. The average normalized mean square error (NMSE) over the k-fold CV sets was adopted as the fitness metric. Twenty MHs, covering both well-established and recently proposed algorithms, were used to tune the SVR hyperparameters and were compared through the nonparametric Friedman test and the post hoc Nemenyi test to identify significant differences. Observations from a typical reservoir landslide served as the benchmark dataset, on which the accuracy, robustness, computational time, and convergence speed of the MHs were compared. The Friedman and post hoc Nemenyi tests identified significant differences among the twenty MHs in mean absolute error (MAE), root mean squared error (RMSE), Kling–Gupta efficiency (KGE), and computational time, with p values below 0.05. The comparison showed that the multiverse optimizer (MVO) is among the most accurate, most stable, and most computationally efficient algorithms, with a near-optimal correlation coefficient (R), a low MAE (23.5086 versus 23.9360), a low mean RMSE (48.6946 versus 50.1882), and a high mean KGE (0.9803 versus 0.9893) in predicting the displacement of the Shuping landslide. These results add to the literature on hyperparameter optimization algorithms and on improving their reliability. In addition, the Friedman and post hoc Nemenyi tests show potential for evaluating and comparing ML-based geohazard models.
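The two quantitative building blocks named above, the k-fold average NMSE used as the fitness value and the Friedman rank statistic used to compare optimizers, can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: NMSE is taken as the mean squared error divided by the variance of the observations (one common convention), the folds are contiguous and unshuffled, `fit_predict` is a hypothetical callable standing in for training an SVR with one candidate hyperparameter set, and the Friedman statistic is computed without a tie correction.

```python
from statistics import fmean


def nmse(observed, predicted):
    """MSE divided by the variance of the observations (assumed NMSE convention)."""
    mean_obs = fmean(observed)
    mse = fmean((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = fmean((o - mean_obs) ** 2 for o in observed)
    return mse / variance  # undefined if a fold's observations are all equal


def kfold_fitness(xs, ys, fit_predict, k=5):
    """Average NMSE over k contiguous folds; lower is better.

    fit_predict(train_x, train_y, test_x) is a hypothetical stand-in for
    fitting a model on the training folds and predicting the held-out fold.
    """
    n = len(xs)
    bounds = [i * n // k for i in range(k + 1)]
    scores = []
    for start, stop in zip(bounds, bounds[1:]):
        train_x = xs[:start] + xs[stop:]
        train_y = ys[:start] + ys[stop:]
        predicted = fit_predict(train_x, train_y, xs[start:stop])
        scores.append(nmse(ys[start:stop], predicted))
    return fmean(scores)


def average_ranks(row):
    """Rank the values of one run (1 = smallest); ties share their mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def friedman_statistic(errors):
    """errors[run][algorithm] -> (chi-square statistic, mean rank per algorithm)."""
    n, k = len(errors), len(errors[0])
    ranked = [average_ranks(row) for row in errors]
    mean_ranks = [fmean(r[j] for r in ranked) for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * sum(r * r for r in mean_ranks) - 3 * n * (k + 1)
    return chi2, mean_ranks
```

In this sketch, a metaheuristic would call `kfold_fitness` once per candidate hyperparameter set and minimize the returned value. The statistic from `friedman_statistic` is compared against a chi-square distribution with k - 1 degrees of freedom, and the mean ranks are the quantities that a post hoc Nemenyi comparison operates on.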