Short-Range Forecasting of COVID-19 During Early Onset at County, Health District, and State Geographic Levels Using Seven Methods: Comparative Forecasting Study.

Christopher J Lynch,Ross Gore

doi:10.2196/24925

Abstract

BackgroundForecasting methods rely on trends and averages of prior observations to forecast COVID-19 case counts. COVID-19 forecasts have received much media attention, and numerous platforms have been created to inform the public. However, forecasting effectiveness varies by geographic scope and is affected by changing assumptions in behaviors and preventative measures in response to the pandemic. Due to time requirements for developing a COVID-19 vaccine, evidence is needed to inform short-term forecasting method selection at county, health district, and state levels.ObjectiveCOVID-19 forecasts keep the public informed and contribute to public policy. As such, proper understanding of forecasting purposes and outcomes is needed to advance knowledge of health statistics for policy makers and the public. Using publicly available real-time data provided online, we aimed to evaluate the performance of seven forecasting methods utilized to forecast cumulative COVID-19 case counts. Forecasts were evaluated based on how well they forecast 1, 3, and 7 days forward when utilizing 1-, 3-, 7-, or all prior–day cumulative case counts during early virus onset. This study provides an objective evaluation of the forecasting methods to identify forecasting model assumptions that contribute to lower error in forecasting COVID-19 cumulative case growth. This information benefits professionals, decision makers, and the public relying on the data provided by short-term case count estimates at varied geographic levels.MethodsWe created 1-, 3-, and 7-day forecasts at the county, health district, and state levels using (1) a naïve approach, (2) Holt-Winters (HW) exponential smoothing, (3) a growth rate approach, (4) a moving average (MA) approach, (5) an autoregressive (AR) approach, (6) an autoregressive moving average (ARMA) approach, and (7) an autoregressive integrated moving average (ARIMA) approach. Forecasts relied on Virginia’s 3464 historical county-level cumulative case counts from March 7 to April 22, 2020, as reported by The New York Times. Statistically significant results were identified using 95% CIs of median absolute error (MdAE) and median absolute percentage error (MdAPE) metrics of the resulting 216,698 forecasts.ResultsThe next-day MA forecast with 3-day look-back length obtained the lowest MdAE (median 0.67, 95% CI 0.49-0.84, P<.001) and statistically significantly differed from 39 out of 59 alternatives (66%) to 53 out of 59 alternatives (90%) at each geographic level at a significance level of .01. For short-range forecasting, methods assuming stationary means of prior days’ counts outperformed methods with assumptions of weak stationarity or nonstationarity means. MdAPE results revealed statistically significant differences across geographic levels.ConclusionsFor short-range COVID-19 cumulative case count forecasting at the county, health district, and state levels during early onset, the following were found: (1) the MA method was effective for forecasting 1-, 3-, and 7-day cumulative case counts; (2) exponential growth was not the best representation of case growth during early virus onset when the public was aware of the virus; and (3) geographic resolution was a factor in the selection of forecasting methods.

Highlights

The scientific community responded quickly to the global outbreak following COVID-19’s identification in December of 2019 [1,2]
The next-day moving average (MA) forecast with 3-day look-back length obtained the lowest median absolute error (MdAE) and statistically significantly differed from 39 out of 59 alternatives (66%) to 53 out of 59 alternatives (90%) at each geographic level at a significance level of
This study explores the error levels of seven common forecasting methods in estimating COVID-19 cumulative case counts at county, health district, and state levels over the upcoming week

Summary

Introduction

The scientific community responded quickly to the global outbreak following COVID-19’s identification in December of 2019 [1,2]. Numerous platforms and studies have been created to forecast the spread of the pandemic and meet the need for intervention measures in support of public health and awareness [1,3,4,5]. To inform public health and support awareness of proper forecast interpretation when making decisions, it is important to understand the basics of the generation of forecasts and the boundaries under which their interpretations are valid. To this point, this study explores the error levels of seven common forecasting methods in estimating COVID-19 cumulative case counts at county, health district, and state levels over the upcoming week. Due to time requirements for developing a COVID-19 vaccine, evidence is needed to inform short-term forecasting method selection at county, health district, and state levels

Methods

Results

Discussion

Conclusion