Abstract
This paper proposes forecasting models for influenza-like illness (ILI) and respiratory disease. The dataset combines air pollutant data from the Taiwan Environmental Protection Administration (EPA) with disease case counts from the Centers for Disease Control (CDC), covering 2009 to 2018. First, we applied the autoregressive integrated moving average (ARIMA) method, trained on the weekly time series of disease cases. Second, we implemented the long short-term memory (LSTM) method, trained on the correlation between weekly disease cases and air pollutants. Both models were trained and evaluated on five and ten years of historical data. ARIMA performed better on the five-year ILI dataset, with a root mean square error (RMSE) of 2564.9, than on the ten-year dataset, with an RMSE of 8173.6. The respiratory disease dataset showed the same pattern: ARIMA reached an RMSE of 15,656.7 on the five-year dataset and 22,680.4 on the ten-year dataset. In contrast, LSTM was more accurate on the ten-year dataset than on the five-year dataset. For ILI, the average LSTM RMSE was 720.2 on five years of data and 517.0 on ten years. For respiratory disease, LSTM obtained an RMSE of 4768.6 on five years of data and 3254.3 on ten years. These experiments indicate that the LSTM model generally outperforms ARIMA, with three to seven times better model performance.
Highlights
We empirically study autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) models that take air pollutant measurements as input and output the predicted number of disease cases
These analyses indicate that the LSTM model typically outperforms ARIMA, with three to seven times higher accuracy
Influenza-like illness and respiratory disease cases were predicted with ARIMA and LSTM models, which were compared over 5-year and 10-year periods
Summary
Outdoor air pollution is a primary factor in death and disease worldwide. The World Health Organization (WHO) reports that around 4.2 million premature deaths are correlated with air pollution [2,3]. Modeling the correlation between air pollutants and diseases such as influenza-like illness (ILI) and respiratory illness is therefore notable [4]. Model quality is assessed with the root mean square error (RMSE). Residuals measure how far the data points lie from the regression line, and the RMSE reflects how widely those residuals are spread. In other words, it indicates how tightly the data are concentrated around the best-fit line. RMSE is commonly used to assess experimental findings in climatology, forecasting, and regression analysis.
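As a minimal sketch of the metric described above, the RMSE can be computed as the square root of the mean squared residual. The function name `rmse` and the weekly case counts below are hypothetical illustrations, not data or code from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one data point is required")
    # Residual = observed value minus the model's prediction.
    squared_residuals = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    # Mean of the squared residuals, then the square root.
    return math.sqrt(sum(squared_residuals) / len(squared_residuals))


# Hypothetical weekly case counts (illustrative only, not from the paper).
observed = [120, 135, 150, 160]
predicted = [118, 140, 145, 166]

print(f"RMSE: {rmse(observed, predicted):.2f}")
```

A lower RMSE means the predictions sit closer to the observed counts. Because the value is in the same units as the data (here, weekly cases), RMSE figures are only directly comparable between models evaluated on the same dataset.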
More From: International Journal of Environmental Research and Public Health