The study and prediction of soil liquefaction is an important and complex problem in geotechnical earthquake engineering. This paper compares how well several machine learning classification models predict soil liquefaction potential. The models include tree-based classifiers, multilayer perceptron (MLP) neural networks, support vector machines (SVM), state-of-the-art ensemble methods, the k-nearest neighbors method, the classical naive Bayes classifier, and logistic regression. Three datasets covering shear-wave velocity measurements, cone penetration testing (CPT), and real historical earthquake cases are used to train and evaluate the classifiers. To make the best use of the wide variety of statistical and machine learning classification algorithms, it is necessary to compare model performance before model selection and to offer advice on a single, stable model for all of the collected datasets. In the comparative study, data preprocessing is first performed to ensure that the data passed to every model are of good quality. Each of the three datasets, which have different input features, is then passed to the machine learning algorithms to obtain a confusion matrix and several evaluation indices for each model. Model performance is assessed reliably through a repeated sub-sampling process, and the experimental results are further supported by receiver operating characteristic (ROC) curves. The results indicate that, although most machine learning methods are able to represent the complex relationship between the seismic properties of soils and the corresponding liquefaction potential, ensemble learning achieved better results on all three datasets and is a promising approach for predicting earthquake-induced soil liquefaction.
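The evaluation procedure named in the abstract (repeated sub-sampling, with a confusion matrix computed on each held-out split) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the data are synthetic two-feature points rather than any of the three datasets, the nearest-centroid classifier is a simple stand-in that is not among the models compared in the paper, and the function names, the 30% test fraction, and the 20 repeats are hypothetical choices.

```python
import random


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 = liquefied."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def fit_nearest_centroid(X, y):
    """Stand-in classifier (assumption): one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_nearest_centroid(centroids, X):
    """Assign each sample to the class with the closest centroid."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in X]


def repeated_subsampling(X, y, n_repeats=20, test_fraction=0.3, seed=0):
    """Repeat a random train/test split and collect test accuracy each time."""
    rng = random.Random(seed)
    n = len(X)
    n_test = max(1, int(round(test_fraction * n)))
    accuracies = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        model = fit_nearest_centroid([X[i] for i in train_idx],
                                     [y[i] for i in train_idx])
        y_pred = predict_nearest_centroid(model, [X[i] for i in test_idx])
        tp, fp, fn, tn = confusion_matrix([y[i] for i in test_idx], y_pred)
        accuracies.append((tp + tn) / n_test)
    return sum(accuracies) / len(accuracies), accuracies


# Synthetic, well-separated two-class data (an assumption, not real case histories).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(50)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

mean_acc, accs = repeated_subsampling(X, y)
print(f"mean accuracy over {len(accs)} splits: {mean_acc:.3f}")
```

Averaging over many random splits reduces the dependence of the reported score on any single partition, which is the motivation for repeated sub-sampling on small case-history datasets. Other indices, such as precision or recall, could be derived from the same four confusion-matrix counts.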