Integrating Satellite Images and Machine Learning for Flood Prediction and Susceptibility Mapping for the Case of Amibara, Awash Basin, Ethiopia

Gizachew Kabite Wedajo, Paolo Gamba, Tesfaye Fufa, Tsegaye Demisis Lemma

doi:10.3390/rs16122163

Abstract

Flooding is one of the most destructive natural hazards, affecting the environment and socioeconomic systems worldwide. Its effects are greater in developing countries because of their higher vulnerability to disasters and limited coping capacity. The Awash basin is one of the flood-prone basins in Ethiopia, where the frequency and severity of flooding have been increasing, and Amibara district is one of its flood-affected areas. Minimizing the effects of flooding requires reliable and up-to-date flood information. However, flood monitoring and forecasting systems are lacking in most basins of Ethiopia, including the Awash basin. Therefore, this study aimed to (i) identify important flood causative factors and (ii) evaluate the performance of random forest (RF), linear regression, support vector machine (SVM), and long short-term memory (LSTM) machine learning models for flood prediction and susceptibility mapping in the Amibara area. Nine causative factors were considered for flood prediction and susceptibility modeling: elevation, slope, aspect, curvature, topographic wetness index, soil texture, rainfall, land use/land cover, and curve number. The Pearson correlation coefficient and information gain ratio (InGR) techniques were used to evaluate the relative importance of the factors. The machine learning models were trained and tested using 400 historic flood points collected from the Sentinel-2 image of 10 September 2020, when a flood event occurred in the area. Multiple metrics, namely precision, recall, F1-score, accuracy, and the area under the receiver operating characteristic curve (AUC), were used to evaluate the performance of the models. The results showed that all the factors considered in this study were important: elevation, rainfall, topographic wetness index, aspect, and slope were more important, while land use/land cover, curve number, curvature, and soil texture were less important.
Furthermore, the results showed that random forest outperformed the other models in predicting and mapping flooding for the study area, with the linear regression model showing the next-best performance, whereas SVM performed poorly. The integration of satellite and field datasets with state-of-the-art machine learning models is a novel approach that improved the accuracy of flood prediction and susceptibility mapping. Such a methodology advances the state of knowledge in this field and fills gaps left by traditional flood mapping techniques. Thus, the results of the study can provide crucial information for informed decision-making when designing flood control strategies and risk management.
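
As an illustration of the evaluation metrics named in the abstract, the following is a minimal Python sketch of how precision, recall, F1-score, accuracy, and the area under the ROC curve can be computed for a binary flood/non-flood classifier. It is not the authors' implementation. The function names, the 0.5 threshold, and the toy labels and scores are assumptions made only for this example, and none of the numbers come from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1 and accuracy for binary labels (1 = flood, 0 = non-flood)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # flood correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed flood
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-flood correctly predicted

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen flood point receives a
    higher score than a randomly chosen non-flood point (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one flood and one non-flood point")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): six points with model scores.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_metrics = classification_metrics(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise form of the AUC is quadratic in the number of points, which is acceptable for a sample of a few hundred flood and non-flood points; a sort-based computation would be preferable for larger datasets.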

Full Text