Background
In this paper, we analyze COVID‐19 data from the United States in 2020 using functional data analysis methods. We investigate the effectiveness of public health measures and assess the correlation between infections and deaths caused by COVID‐19. Additionally, we examine the relationship between the spread of COVID‐19 and geographical location, and propose a forecasting method to predict the total number of confirmed cases nationwide.

Methods
The functional data analysis methods include functional principal component analysis, functional canonical correlation analysis, an expectation‐maximization (EM) based clustering algorithm, and a functional time series model used for forecasting.

Results
The analysis indicates that public health measures help to reduce the growth rate of the outbreak across the nation. We observe a high canonical correlation between confirmed cases and deaths. States that are geographically close to the hot spots are likely to be clustered together, and population density appears to be a critical factor affecting the cluster structure. The proposed functional time series model gives more reliable and accurate predictions of the total number of confirmed cases than standard time series methods.

Conclusions
The results obtained by applying functional data analysis methods provide new insights into the COVID‐19 data in the United States. With our results and recommendations, health professionals can make better decisions to reduce the spread of the epidemic and mitigate its negative effects on national public health.