This study aimed to predict trends in the fine-scale spread of Oncomelania hupensis in Shanghai Municipality using supervised machine learning models, so as to provide insights into precision O. hupensis snail control. Based on 2016 O. hupensis snail survey data from Shanghai Municipality and on climatic, geographical, vegetation and socioeconomic data relating to O. hupensis snail distribution, seven supervised machine learning models were created to predict the risk of snail spread in Shanghai: decision tree, random forest, generalized boosted model, support vector machine, naive Bayes, k-nearest neighbor and C5.0. The performance of the seven models was evaluated with the area under the receiver operating characteristic curve (AUC), F1-score and accuracy (ACC), and the optimal models were selected to identify the environmental variables affecting snail spread and to predict the areas at risk of snail spread in Shanghai Municipality. All seven models were successfully created, and random forest (AUC = 0.901, F1-score = 0.840, ACC = 0.797) and generalized boosted model (AUC = 0.889, F1-score = 0.869, ACC = 0.835) showed higher predictive performance than the other models. Random forest analysis showed that the four most important climatic variables contributing to snail spread in Shanghai were aridity (11.87%), ≥ 0 °C annual accumulated temperature (10.19%), moisture index (10.18%) and average annual precipitation (9.86%), and that the two most important vegetation variables were the vegetation index of the first quarter (8.30%) and the vegetation index of the second quarter (7.69%). Snails were more likely to spread at an aridity of < 0.87, a ≥ 0 °C annual accumulated temperature of 5 550 to 5 675 °C, a moisture index of > 39% and an average annual precipitation of > 1 180 mm, and with a vegetation index of > 0.4 in the first quarter and > 0.6 in the second quarter. According to water resource development and township administrative maps, the areas at risk of snail spread were predicted mainly in 10 townships/subdistricts covering the Xipian, Dongpian and Tainan sections of southern Shanghai. Supervised machine learning models are effective for predicting the risk of fine-scale O. hupensis snail spread and for identifying the environmental determinants of snail spread. The areas at risk of O. hupensis snail spread are mainly located in southwestern Songjiang District, northwestern Jinshan District and southeastern Qingpu District of Shanghai Municipality.
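The three evaluation metrics named above (AUC, F1-score and accuracy) can be illustrated with a minimal Python sketch. This is not the study's code: the function names, the 0.5 classification threshold and the toy labels and scores are all hypothetical, chosen only to show how each metric is computed for a binary "snail spread / no spread" classifier.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all sites classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = snails spread to the site, 0 = no spread.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # predicted risk per site
    y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

    print("ACC:", accuracy(y_true, y_pred))
    print("F1 :", f1_score(y_true, y_pred))
    print("AUC:", auc(y_true, scores))
```

AUC is computed from the predicted risk scores and so does not depend on the threshold, whereas F1-score and accuracy are computed from thresholded class labels. This is why a model can rank first on one metric and second on another, as random forest and the generalized boosted model do here.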