This study is based on the application of an ARIMA (p, d, q) based machine learning (ML) approach to evaluate the dynamics of COVID-19 pandemic. The focus is on estimating epidemic trends and performing diagnostic scrutiny with model fitting. The data including all four waves of the pandemic pertaining to Pakistan, covering all four provinces (Sindh, Punjab, Khyber Pakhtunkhwa, Balochistan, as well as Gilgit Baltistan, Azad Jammu Kashmir, and the capital city Islamabad, collected from February 26, 2020, to September 30, 2021, is analyzed. The ML algorithm is used to optimize the results of ADF, unit root test which ensures the minimum of ACF, and PACF graphs intention of the data series. The results employ the fitted ARIMA models (1, 1, 1) and (1, 1, 7) for the 1st to 4th waves, confirming daily infected cases across the entire dataset of Pakistan. The cumulative trained observations are from the 1st wave (February 26, 2020, to October 20, 2020), 2nd wave (October 21, 2020, to March 16, 2021), 3rd wave (March 17, 2021, to July 10, 2021), and 4th wave (July 11, 2021, to September 30, 2021), with a further 14-day forecast (from October 1 to October 14, 2021). The results show a strong correlation between the trained and predicted values, ranging from 0.8789 to 0.99236. To select predictive model parameters, the model that results in the minimum Bayesian Information Criterion (BIC) value and residuals from the datasets obtained after detaching the unnecessary errors and the 95^% CI for the forecasting error ( ) are calculated. These values would help to decide the best fitted predictive model.
Read full abstract