This study presents an automated machine learning (AutoML) framework to predict the casualty rate and direct economic loss induced by earthquakes. The AutoML framework automates combined algorithm selection and hyperparameter tuning (CASH), reducing the manual work in model development. The proposed framework includes five modules: data collection, data preprocessing, CASH, loss prediction, and model analysis. The AutoML models are trained on a dataset composed of earthquake information and social indicators. The optimal algorithm and hyperparameter settings of the models are determined by the CASH module. A two-step model, consisting of a classifier and a regression model, is designed for the casualty rate to handle zero-casualty cases and to limit their effect on the data distribution. The proposed framework is applied to the seismic loss dataset of mainland China to demonstrate its practicability. A comparison study shows the higher predictive ability of the AutoML model relative to the traditional seismic risk model and other AutoML models. Models trained on the complete dataset outperform those trained on subsets containing only part of the features. The model interpretation results indicate that earthquake magnitude, position, and population density are the leading indicators for loss prediction.
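The two-step casualty model described above can be illustrated with a minimal sketch. This is not the authors' implementation: the decision-stump classifier, the least-squares fit on the log10 casualty rate, the use of magnitude as the only feature, and the toy data are all assumptions made for illustration. In the framework itself, the algorithm and hyperparameters would be chosen by the CASH module.

```python
import math

def fit_stump(x, y):
    """Pick the threshold t on x for which 'x >= t' best matches the labels y."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(x)):
        acc = sum((xi >= t) == bool(yi) for xi, yi in zip(x, y)) / len(x)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def fit_two_step(magnitudes, rates):
    # Step 1: classifier that separates zero-casualty events from the rest.
    labels = [1 if r > 0 else 0 for r in rates]
    threshold = fit_stump(magnitudes, labels)
    # Step 2: regression fitted on nonzero cases only, so the zeros do not
    # distort the target distribution (log10 target is an assumption here).
    pos = [(m, math.log10(r)) for m, r in zip(magnitudes, rates) if r > 0]
    a, b = fit_line([m for m, _ in pos], [lr for _, lr in pos])
    return threshold, a, b

def predict_two_step(model, magnitude):
    threshold, a, b = model
    if magnitude < threshold:
        return 0.0  # classifier predicts a zero-casualty event
    return 10 ** (a + b * magnitude)

# Hypothetical toy data: magnitude and casualty rate per event.
magnitudes = [4.5, 4.8, 5.0, 5.5, 6.0, 6.5, 7.0]
rates = [0.0, 0.0, 0.0, 1e-6, 1e-5, 1e-4, 1e-3]
model = fit_two_step(magnitudes, rates)
```

Calling `predict_two_step(model, m)` returns exactly zero when the classifier predicts no casualties and otherwise returns the back-transformed regression output, which is the separation of zero and nonzero cases that the two-step design relies on.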