Multivariate time series is ubiquitous in real-world applications, yet it often suffers from missing values that impede downstream analytical tasks. In this paper, we introduce the Long Short-Term Memory Network based Recurrent Autoencoder with Imputation Units and Temporal Attention Imputation Model (RATAI), tailored for multivariate time series. RATAI is designed to address certain limitations of traditional RNN-based imputation methods, which often focus on predictive modeling to estimate missing values, sometimes neglecting the contextual impact of observed data at and beyond the target time step. Drawing inspiration from Kalman smoothing, which effectively integrates past and future information to refine state estimations, RATAI aims to extract feature representations from time series data and use them to reconstruct a complete time series, thus overcoming the shortcomings of existing approaches. It employs a dual-stage imputation process: the encoder utilizes temporal information and attribute correlations to predict and impute missing values, and extract feature representation of imputed time series. Subsequently, the decoder reconstructs the series from the feature representation, and the reconstructed values are used as the final imputation values. Additionally, RATAI incorporates a temporal attention mechanism, allowing the decoder to focus on highly relevant inputs during reconstruction. This model can be trained directly using data that contains missing values, avoiding the misleading effects on model training that can arise from setting initial values for missing values. Our experiments demonstrate that RATAI outperforms benchmark models in multivariate time series imputation.
Read full abstract