Abstract

Air pollution is one of the most severe problems facing the world, and research on air quality prediction and on the factors that influence air quality continues to grow. Such research requires valid, authentic, high-quality air pollution data to produce reasonable results. However, missing values are unavoidable in multivariate time series for several reasons, such as sensor and communication failures. Most previous missing-data algorithms cannot effectively capture the temporal and spatial mechanisms of air pollution, handle multiple missing patterns, or deal with sequences that have high missing rates. To address this problem, this paper proposes a new deep spatiotemporal imputation method, the transferred Multiple LSTM-based deep auto-encoder (TMLSTM-AE). The idea is intuitive: train an auto-encoder to estimate the missing values. The model uses spatial and time-series information to fill in single missing, multiple missing, block missing, and long-interval consecutive missing values in air quality data. To verify the effectiveness and superiority of the proposed model, we conducted a case study in a city in Shaanxi, China, in which PM2.5 data with long-interval consecutive missing values and with different missing patterns were imputed. The results indicate that the proposed model performs well and outperforms existing models across different missing patterns and for long-interval consecutive missing values.
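
The four missing patterns named above can be made concrete with a small masking sketch. This is only an illustration: the abstract does not define the patterns precisely, so the interpretations below are assumptions. "Single" is taken as isolated gaps in one sensor, "multiple" as isolated gaps scattered across sensors, "block" as all sensors missing over the same window, and "long_interval" as one sensor missing over a long consecutive run. The function `make_mask` and its parameters are hypothetical and are not part of TMLSTM-AE.

```python
import random

def make_mask(n_steps, n_sensors, pattern, rate=0.1, block_len=24, seed=0):
    """Return mask[t][s] == True where a reading is treated as missing.

    Pattern definitions are assumptions made for illustration only.
    """
    rng = random.Random(seed)
    mask = [[False] * n_sensors for _ in range(n_steps)]
    if pattern == "single":
        # Isolated gaps in one sensor (column 0).
        for t in rng.sample(range(n_steps), int(rate * n_steps)):
            mask[t][0] = True
    elif pattern == "multiple":
        # Isolated gaps scattered across all sensors.
        cells = [(t, s) for t in range(n_steps) for s in range(n_sensors)]
        for t, s in rng.sample(cells, int(rate * len(cells))):
            mask[t][s] = True
    elif pattern == "block":
        # Every sensor missing over the same time window.
        start = rng.randrange(n_steps - block_len + 1)
        for t in range(start, start + block_len):
            for s in range(n_sensors):
                mask[t][s] = True
    elif pattern == "long_interval":
        # One sensor (column 0) missing over a long consecutive run.
        start = rng.randrange(n_steps - block_len + 1)
        for t in range(start, start + block_len):
            mask[t][0] = True
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    return mask
```

Applying such a mask to complete data gives held-out ground truth: the masked readings can be compared with whatever an imputation model fills in, which is one common way to evaluate imputation under each pattern.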
