Abstract

Recovering missing values plays a significant role in practical time series tasks. How to replace missing data and model dependencies from an incomplete sample set remains a challenge. Previous research has found that the residual network (ResNet) uses shortcut connections to make deep networks trainable and to cope with the degradation problem. The gated recurrent unit (GRU) simplifies the network and reduces the number of trainable parameters through an update gate, which takes the place of the forget gate and output gate in long short-term memory (LSTM). Inspired by these findings, we observe that a shortcut connection and the mean of the global revealed information can model the relationship among the missing items, the previous revealed information, and the overall revealed information. Hence, we design an imputation network that adds a decay factor for the shortcut connection and the mean of the global revealed information to the GRU, called decay residual mean imputation GRU (DRMI-GRU). We introduce a decay residual mean unit (DRMU), which takes full advantage of the previous and global revealed information to model incomplete time series; the decay factor balances the previous long-term dependencies against all non-missing values in the sample set. In addition, a mask unit is designed to check whether data are missing. Extensive empirical comparisons with existing imputation algorithms on real-world data and a public dataset, at different ratios of missing data, verify the performance of our model.
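
As a rough illustration of the ideas named in the abstract (a mask that flags missing entries, and a decay factor that balances the most recent revealed value against the mean of all revealed values), the sketch below imputes a single univariate series in plain Python. It is not the DRMU or DRMI-GRU itself: the exponential form of the decay, the function names `build_mask` and `decay_mean_impute`, and the parameter `decay_rate` are all assumptions made for this example, and no GRU is involved.

```python
import math


def build_mask(series):
    """Return 1 where a value is observed and 0 where it is missing (None)."""
    return [0 if x is None else 1 for x in series]


def decay_mean_impute(series, decay_rate=0.5):
    """Fill missing entries with a decayed blend of two pieces of revealed information.

    Hypothetical rule (an assumption, not the paper's equations):
        gamma   = exp(-decay_rate * steps_since_last_observation)
        imputed = gamma * last_observed + (1 - gamma) * global_mean
    """
    mask = build_mask(series)
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("series has no observed values")
    # Mean of the global revealed (non-missing) information.
    global_mean = sum(observed) / len(observed)

    imputed = []
    last_value = None  # most recent observed value, if any
    gap = 0            # number of steps since that observation
    for x, m in zip(series, mask):
        if m == 1:
            last_value, gap = x, 0
            imputed.append(x)
            continue
        gap += 1
        if last_value is None:
            # Nothing observed yet: fall back to the global mean alone.
            imputed.append(global_mean)
        else:
            # The longer the gap, the less weight the previous value gets.
            gamma = math.exp(-decay_rate * gap)
            imputed.append(gamma * last_value + (1 - gamma) * global_mean)
    return imputed, mask
```

With this rule, a missing value right after an observation stays close to that observation, while values deep inside a long gap drift toward the global mean. In a learned model the decay would typically be a trained parameter rather than a fixed constant.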

Highlights

  • Time series data are ubiquitous in daily life, for example in medical records [14], sales profits, and new user volumes, which has raised awareness of the importance of data analysis for this type of data

  • We address the problem of missing data in time series from the viewpoint of a decayed shortcut connection and global revealed information, and introduce an imputation model based on the gated recurrent unit (GRU)

  • Zhang et al.: Time Series Imputation via Integration of Revealed Information Based on the Residual Shortcut Connection TABLE 2


Summary

Introduction

Time series data are ubiquitous in daily life, for example in medical records [14], sales profits, and new user volumes, which has raised awareness of the importance of data analysis for this type of data. Data mining is the main approach to such analysis, and the quality of the collected data is a major factor in its results: the better the data quality, the better the data mining results. In practice, some time series data may be missing because of incomplete collection, noise, or broken devices [27]. In addition, the collection time interval of the time series may not meet the requirements.

