Abstract

COVID-19 has swept through the world since December 2019, causing a large number of infections and deaths. Spatial prediction of the epidemic's spread is important for disease control and management. In this study, we predicted the cumulative confirmed cases (CCCs) from Jan 17 to Mar 1, 2020, in mainland China at the city level, using machine learning algorithms, geographically weighted regression (GWR), and partial least squares regression (PLSR) based on population flow, geolocation, meteorological, and socioeconomic variables. The validation results showed that the machine learning algorithms and GWR performed well. These models could not effectively predict CCCs in Wuhan, the first city in China to report COVID-19 cases, but performed well in other cities. Random Forest (RF) outperformed the other methods, with a CV‐R2 of 0.84. In this model, the population flow from Wuhan to other cities (WP) was the most important feature, and the other features also contributed considerably to prediction accuracy. Compared with RF, GWR performed slightly worse (CV‐R2 = 0.81) but required fewer spatial independent variables. This study explored the spatial prediction of the epidemic based on multisource spatial independent variables, providing references for the estimation of CCCs in the regions lacking accurate and timely.
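As a rough illustration of the CV‐R2 metric reported above, the following sketch computes a pooled k-fold cross-validated R² in plain Python. It is not the study's pipeline: the simple linear model stands in for RF, the data are synthetic, and the names `fit_line`, `cv_r2`, `flow`, and `cases`, the choice of k = 10, and the pooling of out-of-fold predictions are all assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in model, not RF)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_r2(xs, ys, k=10, seed=0):
    """Pooled k-fold CV R^2: 1 - SSE/SST over out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            # Each point is predicted by a model that never saw it.
            preds[i] = a + b * xs[i]
    mean_y = sum(ys) / len(ys)
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst


# Synthetic example: a hypothetical "population flow" predictor and a
# noisy linear response (made-up numbers, not data from the study).
rng = random.Random(42)
flow = [rng.uniform(0, 10) for _ in range(200)]
cases = [3.0 * x + 5.0 + rng.gauss(0, 2.0) for x in flow]
score = cv_r2(flow, cases, k=10)
print(round(score, 3))
```

Because every prediction comes from a model fitted without the held-out city, a score computed this way reflects out-of-sample accuracy rather than goodness of fit on the training data.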

Highlights

  • COVID-19, a novel coronavirus disease, was first reported in Wuhan, China, in December 2019 and subsequently swept across China

  • Partial least squares regression (PLSR) extracts the principal components of the independent variables and uses canonical correlation analysis (CCA) and multiple linear regression (MLR) to generate the prediction model

  • This study explored the potential of machine learning algorithms in the spatial prediction of COVID-19 and compared them with PLSR and geographically weighted regression (GWR)
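For readers unfamiliar with GWR, one of the methods compared in the highlights above, the sketch below shows its core idea: a separate weighted least-squares fit at each location, with weights that decay with distance. It is a simplified illustration and not the study's implementation. The single covariate, the fixed-bandwidth Gaussian kernel, the planar coordinates, and the function name `gwr_predict` are all assumptions, and the data are synthetic.

```python
import math


def gwr_predict(coords, xs, ys, target_coord, target_x, bandwidth):
    """Fit a locally weighted line at target_coord and predict y at target_x.

    Weights follow a Gaussian kernel exp(-0.5 * (d / bandwidth)^2), so
    nearby observations shape the local coefficients more than distant ones.
    Returns (prediction, (local_intercept, local_slope)).
    """
    ws = []
    for u, v in coords:
        d = math.hypot(u - target_coord[0], v - target_coord[1])
        ws.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept + slope * target_x, (intercept, slope)


# Synthetic example: the x-to-y relationship has slope 1 in the west
# (u < 0) and slope 3 in the east, which a single global fit would blur.
coords, xs, ys = [], [], []
for i in range(-10, 11):
    for j in range(3):
        u, x = float(i), float(j)
        local_slope = 1.0 if u < 0 else 3.0
        coords.append((u, 0.0))
        xs.append(x)
        ys.append(local_slope * x)

pred_west, (a_west, b_west) = gwr_predict(coords, xs, ys, (-8.0, 0.0), 2.0, 1.0)
pred_east, (a_east, b_east) = gwr_predict(coords, xs, ys, (8.0, 0.0), 2.0, 1.0)
print(b_west, b_east)
```

The bandwidth controls how local each fit is. In practice it is usually tuned, for example by cross-validation, instead of being fixed by hand as it is here.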

Summary

Introduction

COVID-19, a novel coronavirus disease, was first reported in Wuhan, China, in December 2019 and subsequently swept across China. (1) Some scholars investigated the influence of meteorological factors on the transmission of COVID-19 [5,6,7,8,9]. They collected meteorological variables such as temperature and humidity and developed models to evaluate the influence of these factors on the number of cases or deaths. (3) Other studies predicted the spread of COVID-19 from historical case data using infectious disease dynamics models [13,14,15,16,17] and machine learning algorithms [18,19].

