Abstract

Ideally, data are carefully collected, follow regular patterns and contain no missing values; in practice, this is not always the case. This study examines four (4) methods of estimating missing values in time series: mean imputation (MI), median imputation (MDI), linear imputation (LI) and the Kalman filter algorithm (KAL). The study used nine (9) pairs of simulated series; each pair consists of an “actual series” and a “12% missingness series”. Three (3) sample sizes, small (50), medium (200) and large (1000), were combined with additive models having linear, quadratic and exponential forms of trend. The missing values in the 12% missingness series were estimated using MI, MDI, LI and KAL. The performance of each method was checked using the root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE), while the overall performance of the estimating methods was assessed using the average of these accuracy measures (RMSE, MAE and MAPE). The average accuracy measures show that KAL outperformed the other methods (MI, MDI and LI) at all three sample sizes when the trend was linear, and that MDI outperformed the other methods at all three (3) sample sizes when the trend was exponential. When the trend was quadratic, MI outperformed the other methods at the small and large sample sizes, whereas KAL performed better at the medium sample size. Hence, KAL, MI and MDI are recommended for estimating missing data in time series when the trend is linear, quadratic and exponential respectively, until further study proves otherwise.
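The evaluation procedure described above can be illustrated with a minimal Python sketch. It is not the study's code. The trend coefficients, the noise level, the random seeds, the helper names (`simulate_linear_series`, `make_missing`, `impute`, `accuracy`) and the choice to score only the positions that were removed are all assumptions made for illustration. Linear imputation is taken here to mean linear interpolation between neighbouring observed values, and KAL is omitted because it needs a state-space model that the abstract does not specify.

```python
import numpy as np


def simulate_linear_series(n, seed=0):
    """Additive model: linear trend plus Gaussian noise (coefficients are assumed)."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1)
    return 10.0 + 0.5 * t + rng.normal(0.0, 1.0, n)


def make_missing(series, frac=0.12, seed=1):
    """Return a copy with roughly `frac` of the values set to NaN at random positions."""
    rng = np.random.default_rng(seed)
    n = len(series)
    k = int(round(frac * n))
    # Assumption: endpoints stay observed so interpolation is defined everywhere.
    idx = rng.choice(np.arange(1, n - 1), size=k, replace=False)
    out = series.copy()
    out[idx] = np.nan
    return out


def impute(series, method):
    """Fill NaN values by mean (MI), median (MDI) or linear interpolation (LI)."""
    out = series.copy()
    miss = np.isnan(out)
    if method == "MI":
        out[miss] = np.nanmean(series)
    elif method == "MDI":
        out[miss] = np.nanmedian(series)
    elif method == "LI":
        pos = np.arange(len(out))
        out[miss] = np.interp(pos[miss], pos[~miss], out[~miss])
    else:
        raise ValueError(f"unknown method: {method}")
    return out


def accuracy(actual, estimate, mask):
    """RMSE, MAE, MAPE (in percent) and their average over the masked positions."""
    err = actual[mask] - estimate[mask]
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    mape = float(100.0 * np.mean(np.abs(err / actual[mask])))
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape,
            "average": (rmse + mae + mape) / 3.0}


# One (actual series, 12% missingness series) pair at the medium sample size.
actual = simulate_linear_series(200)
gappy = make_missing(actual, frac=0.12)
mask = np.isnan(gappy)

results = {m: accuracy(actual, impute(gappy, m), mask) for m in ("MI", "MDI", "LI")}
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Repeating the last block for each sample size (50, 200, 1000) and each trend form would give the nine pairs the abstract describes; the quadratic and exponential generators would need their own assumed coefficients.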
