Abstract

Properly chosen hyper-parameters improve a machine learning model's performance and reduce training time and resource requirements. In this study, we investigated the use of the Bayesian optimization algorithm for the hyper-parameter search of two classifiers, namely LightGBM and XGBoost. The models were verified on a dataset from Vietnam, comprising historical flood locations derived from satellite images and survey data, and 11 features from three groups, namely physical, hydrological, and human-related factors. Model performance was evaluated using the Area Under the Receiver Operating Characteristic curve (AUC-ROC). Several strategies were applied to avoid over-fitting, and the results show that the two tuned gradient boosters reached considerably high AUC values (approximately 0.98) relative to a previous study with a similar dataset. Model interpretation was also implemented using Shapley additive explanation (SHAP) values to better understand how the models work and how the features interact. The search for optimal hyper-parameters is worth investigating further, particularly as work on novel optimization algorithms continues to grow. The approach verified here is scientifically sound, and the models can serve as an alternative solution for natural hazard analysis in hazard-prone countries.
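
For readers who want a concrete picture of the workflow summarized above, the following is a minimal sketch (not the authors' code) of a Bayesian hyper-parameter search for LightGBM with AUC-ROC evaluation and SHAP interpretation. It assumes the scikit-optimize, LightGBM, scikit-learn, and SHAP Python packages; the feature matrix X, the labels y, and the search ranges are hypothetical placeholders, since the abstract does not specify the actual implementation or search space.

```python
# Sketch only: Bayesian hyper-parameter search for LightGBM, AUC-ROC evaluation,
# and SHAP interpretation. Data and search ranges are hypothetical placeholders.
import numpy as np
from lightgbm import LGBMClassifier
from skopt import BayesSearchCV
from skopt.space import Real, Integer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
import shap

# Hypothetical data: 11 features standing in for the physical, hydrological,
# and human-related factors, with binary flood/non-flood labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 11))
y = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Assumed search space; the paper's actual ranges are not given in the abstract.
search = BayesSearchCV(
    LGBMClassifier(n_estimators=300, random_state=42),
    {
        "learning_rate": Real(0.01, 0.3, prior="log-uniform"),
        "num_leaves": Integer(15, 127),
        "max_depth": Integer(3, 12),
        "min_child_samples": Integer(5, 50),
        "subsample": Real(0.5, 1.0),
        "colsample_bytree": Real(0.5, 1.0),
    },
    n_iter=30,
    scoring="roc_auc",
    cv=5,            # cross-validation is one common guard against over-fitting
    random_state=42,
)
search.fit(X_train, y_train)

# Evaluate the tuned model with AUC-ROC on held-out data.
auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print(f"Test AUC-ROC: {auc:.3f}")

# SHAP values for interpreting feature contributions and interactions.
explainer = shap.TreeExplainer(search.best_estimator_)
shap_values = explainer.shap_values(X_test)
```

The same pattern applies to XGBoost by swapping in an XGBClassifier and its corresponding hyper-parameter ranges.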
