Abstract
COVID-19 has swept through the world since December 2019 and caused a large number of infections and deaths. Spatial prediction of the spread of the epidemic is important for disease control and management. In this study, we predicted the cumulative confirmed cases (CCCs) from Jan 17 to Mar 1, 2020, in mainland China at the city level using machine learning algorithms, geographically weighted regression (GWR), and partial least squares regression (PLSR), based on population flow, geolocation, meteorological, and socioeconomic variables. The validation results showed that the machine learning algorithms and GWR performed well. These models could not effectively predict CCCs in Wuhan, the first city in China to report COVID-19 cases, but performed well in other cities. Random Forest (RF) outperformed the other methods, with a CV‐R2 of 0.84. In this model, the population flow from Wuhan to other cities (WP) was the most important feature, and the other features also contributed considerably to prediction accuracy. Compared with RF, GWR performed slightly worse (CV‐R2 = 0.81) but required fewer spatial independent variables. This study explored the spatial prediction of the epidemic based on multisource spatial independent variables, providing references for the estimation of CCCs in the regions lacking accurate and timely.
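The CV‐R2 values reported above are cross-validated coefficients of determination. The abstract does not state the number of folds or how the score was aggregated, so the following is only a minimal sketch under assumptions: 5 folds, R2 computed once on the pooled out-of-fold predictions, and a one-variable least-squares line standing in for the actual models (RF, GWR, PLSR). The function names and the toy data are hypothetical and are not taken from the study.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for a real model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cv_r2(xs, ys, k=5, seed=0):
    """k-fold cross-validated R2 from pooled out-of-fold predictions.

    Assumption: k=5 and pooled scoring; the source does not specify either.
    """
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]

    predictions = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Each sample is predicted by a model that never saw it.
        for i in fold:
            predictions[i] = intercept + slope * xs[i]

    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy data (invented): one predictor and a noisy linear response.
rng = random.Random(42)
flow = [float(v) for v in range(1, 21)]
cases = [2.0 * v + 1.0 + rng.gauss(0.0, 3.0) for v in flow]
print(round(cv_r2(flow, cases), 3))
```

Because every prediction comes from a model trained without that sample, this score is usually lower than the in-sample R2, which is why it is the more conservative figure to compare across methods.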
Highlights
In December 2019, a novel coronavirus disease named COVID-19 was first reported in Wuhan, China, and then swept across China
Partial least squares regression (PLSR) extracts the principal components of independent variables and uses canonical correlation analysis (CCA) and multiple linear regression (MLR) to generate the prediction model
This study explored the potential of machine learning algorithms in the spatial prediction of COVID-19 and compared them with PLSR and geographically weighted regression (GWR)
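To make the PLSR highlight more concrete, here is a minimal sketch of a textbook single-component PLS regression for one response variable (often called PLS1). It is not the authors' implementation: it extracts one latent component from the centered predictors, chosen to covary with the response, and then regresses the response on that component. All function names and the toy data are hypothetical.

```python
import math


def pls1_one_component(X, y):
    """Fit a one-component PLS1 model; returns (intercept, coefficients)."""
    n, m = len(X), len(X[0])

    # Center predictors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: direction in predictor space that covaries with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Scores: projection of each sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]

    # Regress the centered response on the single score vector.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    # Map back to coefficients on the original predictors.
    coef = [wj * q for wj in w]
    intercept = y_mean - sum(c * mu for c, mu in zip(coef, x_mean))
    return intercept, coef


def predict(intercept, coef, row):
    """Predict the response for one row of predictor values."""
    return intercept + sum(c * v for c, v in zip(coef, row))


# Toy data (invented): two uncorrelated predictors, y = 3 + 2*x1 + 1*x2.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [6.0, 4.0, 2.0, 0.0]
intercept, coef = pls1_one_component(X, y)
print(intercept, coef)
```

With uncorrelated, equally scaled predictors as in this toy case, one component is enough to recover the linear relationship. With correlated predictors, more components (each fitted after deflating the predictors) are generally needed, and the number of components is typically chosen by cross-validation.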
Summary
In December 2019, a novel coronavirus disease named COVID-19 was first reported in Wuhan, China, and then swept across China. (1) Some scholars investigated the influence of meteorological factors on the transmission of COVID-19 [5,6,7,8,9]. They collected meteorological factors such as temperature and humidity and developed models to evaluate the influence of these factors on the number of cases or deaths. (3) Other studies predicted the spread of COVID-19 based on historical case data using infectious disease dynamics models [13,14,15,16,17] and machine learning algorithms [18, 19].
More From: Computational and mathematical methods in medicine