The COVID-19 outbreak has significantly changed human lifestyles and patterns of daily life. Data on human social life may therefore signal increases or decreases in the number of confirmed COVID-19 cases. However, although the number of confirmed cases is affected by social life, few studies have attempted to predict it using a range of lifestyle data. This paper presents an exploratory data analysis of whether the number of confirmed cases can be predicted more accurately when such lifestyle data are included. The lifestyle data comprise public transportation use, cinema attendance, and motel stays. A 'lifestyle addition' set was constructed by adding these lifestyle data to the number of past confirmed cases and search term frequency data. The deep learning algorithms used in the analysis are deep neural networks (DNNs) and recurrent neural networks (RNNs). Performance differences across data sets and between deep learning models were found to be statistically significant. Among metropolitan cities in South Korea, Seoul (population 9.6 million), the largest, and Busan (3.4 million), the second largest, had the lowest error rates with the 'lifestyle addition' set. When predicting with the 'lifestyle addition' set, the error rate in Seoul was reduced to 20.1%, and in Busan the predicted curve of confirmed cases was almost identical to the actual curve. Through this study, we identified three notable results that could contribute to predicting the number of infected patients in future epidemics.
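As an illustration of how a 'lifestyle addition' set might be assembled, the following is a minimal sketch and not the authors' pipeline. It assumes daily, date-aligned series and a fixed look-back window, and it assumes the "error rate" is a mean absolute percentage error, which the abstract does not specify. All names (`build_feature_rows`, `lag`, the series keys) are hypothetical and every number is a synthetic placeholder.

```python
from typing import Dict, List, Sequence, Tuple


def build_feature_rows(
    series: Dict[str, Sequence[float]], target: str, lag: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair the previous `lag` days of every series with the next-day target."""
    n = len(series[target])
    if any(len(values) != n for values in series.values()):
        raise ValueError("all series must have the same length")
    names = sorted(series)  # fixed column order across rows
    features, targets = [], []
    for t in range(lag, n):
        row: List[float] = []
        for name in names:
            row.extend(series[name][t - lag:t])
        features.append(row)
        targets.append(series[target][t])
    return features, targets


def mean_absolute_percentage_error(
    actual: Sequence[float], predicted: Sequence[float]
) -> float:
    """MAPE in percent; days with zero actual cases are skipped (an assumption)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to score against")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Synthetic placeholder values, NOT data from the study.
baseline = {
    "cases": [120, 135, 150, 160, 155, 170],
    "search": [40, 42, 55, 60, 58, 63],
}
# The 'lifestyle addition' set adds the three lifestyle series to the baseline.
lifestyle_addition = dict(
    baseline,
    transit=[900, 870, 860, 820, 830, 800],
    cinema=[50, 48, 40, 35, 36, 30],
    motel=[20, 21, 19, 18, 18, 17],
)

X_base, y_base = build_feature_rows(baseline, target="cases", lag=3)
X_life, y_life = build_feature_rows(lifestyle_addition, target="cases", lag=3)
```

Rows from either set could then be fed to a DNN or, reshaped into sequences, to an RNN, and the two sets compared with the same error metric.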