Abstract

Background: This study aims to identify the best model for predicting the incidence of hand, foot and mouth disease (HFMD) in Ningbo by comparing Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory Neural Network (LSTM) models, each fitted with and without exogenous meteorological variables. Methods: Daily HFMD incidence data in Ningbo from January 2014 to November 2017 served as the training set, and the data for December 2017 served as the test set. ARIMA and LSTM models, with and without exogenous meteorological variables, were fitted to the daily incidence of HFMD using the training set. The forecasting performance of the four fitted models was then evaluated on the test set, with root mean square error (RMSE) as the main performance measure. Results: The RMSE for multivariate LSTM, univariate LSTM, ARIMA and ARIMAX (Autoregressive Integrated Moving Average Model with Exogenous Input Variables) was 10.78, 11.20, 12.43 and 14.73, respectively. The LSTM model with exogenous meteorological variables performed best among the four models, and the meteorological variables improved the prediction accuracy of the LSTM model. For the ARIMA model, exogenous meteorological variables did not improve prediction accuracy and instead acted as an interfering factor. Conclusions: Multivariate LSTM was the best of the four models for predicting the daily incidence of HFMD in Ningbo. It offers a scientific method for building an HFMD early warning system, and the methodology can also be applied to other communicable diseases.
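The evaluation described above (a chronological train/test split scored by RMSE) can be illustrated with a minimal Python sketch. The function names `rmse` and `chronological_split`, the toy series and the naive "repeat the last training value" forecast are all assumptions made for illustration; they are not the study's code, data or models.

```python
import math
from datetime import date, timedelta


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def chronological_split(records, test_start):
    """Split (date, value) records: train is before test_start, test is on or after it."""
    train = [r for r in records if r[0] < test_start]
    test = [r for r in records if r[0] >= test_start]
    return train, test


# Hypothetical toy series of daily counts (NOT data from the study).
start = date(2017, 11, 29)
records = [(start + timedelta(days=i), float(10 + i)) for i in range(5)]

# Hold out everything from 1 December 2017 onward, mirroring the split in the abstract.
train, test = chronological_split(records, date(2017, 12, 1))

# Naive placeholder forecast (an assumption): repeat the last training value.
last_value = train[-1][1]
observed = [value for _, value in test]
predicted = [last_value] * len(test)

print(f"train days: {len(train)}, test days: {len(test)}")
print(f"RMSE of naive forecast: {rmse(observed, predicted):.2f}")
```

A lower RMSE means the predictions lie closer to the observed daily counts, which is the basis on which the four models are ranked.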

Highlights

  • The Autoregressive Integrated Moving Average (ARIMA) model is an adaptation of discrete time-filtering methods developed in the 1930s–1940s by electrical engineers [15].

  • Note: SD stands for standard deviation, Min for minimum value, Max for maximum value, and P25, P50 and P75 for the 25th, 50th and 75th percentiles; Tmean stands for daily mean temperature, Pmean for daily mean pressure, RHmean for daily mean relative humidity, WSmean for daily mean wind speed and PPTN for daily precipitation.

  • The daily mean temperature and daily mean pressure were used as exogenous variables, and four models were used to predict the daily incidence of HFMD.

Summary

Introduction

LSTM is a special type of Recurrent Neural Network (RNN) that has seen increasing use in recent years in domains such as stock prediction [16], speech recognition [17] and disease prediction, including HFMD [18], COVID-19 [19] and HIV [20]. Both ARIMA and LSTM are suitable for analyzing time series data and making predictions. This study aims to identify the best model for predicting HFMD incidence in Ningbo by comparing ARIMA and LSTM models, each fitted with and without exogenous meteorological variables, based on the daily incidence of HFMD and daily meteorological data in Ningbo from 2014 to 2017.

Study Area
HFMD Incidence and Meteorological Data
ARIMA Model
LSTM Model
One Step Ahead Rolling Forecast
Descriptive Analysis
ARIMA and ARIMAX Model
Univariate LSTM and Multivariable LSTM Model
Prediction Performance Comparison
Discussion
Conclusions