Comparison of Missing Data Imputation Methods in Time Series Forecasting

Hyun Ahn, Kyunghee Sun, Kwanghoon Pio Kim

doi:10.32604/cmc.2022.019369

Open Access

Abstract

Time series forecasting has become an important aspect of data analysis and has many real-world applications. However, undesirable missing values are often encountered, which may adversely affect many forecasting tasks. In this study, we evaluate and compare the effects of imputation methods for estimating missing values in a time series. Our approach does not include a simulation to generate pseudo-missing data; instead, we perform imputation on actual missing data and measure the performance of the forecasting model created therefrom. In the experiments, several time series forecasting models are therefore trained on different training datasets, each prepared using one of the imputation methods. Subsequently, the performance of the imputation methods is evaluated by comparing the accuracy of the forecasting models. The results obtained from a total of four experimental cases show that the k-nearest neighbor technique is the most effective in reconstructing missing data and contributes positively to time series forecasting compared with other imputation methods.
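The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy series, the function names, the AR(1) forecaster, and the nearest-in-time neighbor average (used here as a simple stand-in for the k-nearest neighbor technique) are all assumptions made for illustration. The idea is only to show how imputation methods can be ranked by the forecasting error of models trained on the imputed data, without simulating missing values.

```python
from statistics import mean

def impute_mean(series):
    """Replace each missing value (None) with the mean of the observed values."""
    fill = mean([x for x in series if x is not None])
    return [fill if x is None else x for x in series]

def impute_knn(series, k=2):
    """Replace each missing value with the average of the k observed values
    nearest in time (a simplified, assumed variant of k-NN imputation)."""
    observed = [(i, x) for i, x in enumerate(series) if x is not None]
    out = []
    for i, x in enumerate(series):
        if x is None:
            nearest = sorted(observed, key=lambda p: abs(p[0] - i))[:k]
            x = mean([v for _, v in nearest])
        out.append(x)
    return out

def fit_ar1(train):
    """Least-squares fit of x[t] = a + b * x[t-1]; returns (a, b)."""
    xs, ys = train[:-1], train[1:]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return my - b * mx, b

def forecast_mae(train, test):
    """Mean absolute error of one-step-ahead AR(1) forecasts on the test data."""
    a, b = fit_ar1(train)
    prev, errors = train[-1], []
    for actual in test:
        errors.append(abs(a + b * prev - actual))
        prev = actual  # the true value becomes the next input
    return mean(errors)

# Hypothetical toy data: the training part contains real gaps (None),
# while the held-out test part is fully observed.
train_raw = [10.0, 11.0, None, 13.0, 12.0, None, None, 15.0, 14.0, 16.0]
test = [15.0, 17.0, 16.0]

for name, impute in [("mean", impute_mean), ("k-NN", impute_knn)]:
    mae = forecast_mae(impute(train_raw), test)
    print(f"{name} imputation -> forecast MAE = {mae:.3f}")
```

A lower forecast error on the held-out data is read as evidence that the corresponding imputation method reconstructed the gaps more usefully for forecasting. The ranking on such a small toy series carries no meaning by itself.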

Highlights

The recent emergence of cutting-edge computing technology, such as the internet of things (IoT) and big data, has resulted in a new era in which large-scale data can be generated, collected, and exploited
This section introduces the concept of missing data imputation in the time series and the imputation methods used in the experiments
We evaluated the effects of imputation methods for replacing missing values with estimated values

Summary

Introduction

The recent emergence of cutting-edge computing technology, such as the internet of things (IoT) and big data, has resulted in a new era in which large-scale data can be generated, collected, and exploited. Numerous missing values often coexist within such rich data. These missing values are considered major obstacles in data analysis because they distort the statistical properties of the data and reduce its availability. For example, when obtaining data from a questionnaire, many respondents are likely to intentionally omit a response to a question that is difficult to answer. As another example, when collecting data measured by machines or computer systems, various types of missing values can occur owing to mechanical defects or system malfunctions. The primary types of missing values identified in previous studies related to the field of statistics are as follows:

Journal: Computers, Materials & Continua
Publication Date: Jan 1, 2022
Citations: 19
License type: cc-by