Abstract

Increasing use of sensor data in intelligent transportation systems calls for accurate imputation algorithms that can enable reliable traffic management in the occasional absence of data. Generative adversarial networks (GANs) are implicit generative models that offer an effective approach to data imputation, which can be formulated as an unsupervised learning problem. This work introduces a novel iterative GAN architecture, called Iterative Generative Adversarial Networks for Imputation (IGANI), for data imputation. IGANI imputes data in two steps and maintains the invertibility of the generative imputer, which will be shown to be a sufficient condition for the convergence of the proposed GAN-based imputation. The performance of our proposed method is evaluated on (1) the imputation of traffic speed data collected in the city of Guangzhou in China, and the training of short-term traffic prediction models using imputed data, and (2) the imputation of multi-variable traffic data of highways in the Portland-Vancouver metropolitan region, which includes volume, occupancy, and speed with different missing rates for each of them. It is shown that our proposed algorithm mostly produces more accurate results compared to those of previous GAN-based imputation architectures.
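The masking formulation that underlies GAN-based imputation can be sketched as follows: the generator proposes values for every entry, but only the missing entries are taken from it, while observed entries pass through unchanged. The function names and the placeholder "generator" below are illustrative, not the paper's actual code.

```python
import numpy as np

def impute(x, mask, generator):
    """Combine observed entries of x with generator output on missing entries.

    x:         data matrix (arbitrary values at missing positions)
    mask:      1.0 where observed, 0.0 where missing
    generator: maps (x, mask) to a full matrix of candidate values
    """
    x_hat = generator(x, mask)
    # Observed entries are kept; only missing entries come from the generator.
    return mask * x + (1 - mask) * x_hat

# Toy example with a placeholder "generator" (column means of observed data).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
mask = (rng.random((5, 3)) > 0.3).astype(float)

def mean_generator(x, mask):
    col_means = (x * mask).sum(axis=0) / np.maximum(mask.sum(axis=0), 1.0)
    return np.broadcast_to(col_means, x.shape)

x_imputed = impute(x, mask, mean_generator)
# Observed data is always preserved under this formulation.
assert np.allclose(x_imputed[mask == 1], x[mask == 1])
```

In a GAN-based imputer, `mean_generator` would be replaced by a trained neural network, and a discriminator would be trained to distinguish observed from imputed entries.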

Highlights

  • Traffic data is generated faster than before as intelligent transportation systems (ITS) develop

  • In this work, a new generative adversarial network (GAN) architecture, named Iterative Generative Adversarial Networks for Imputation (IGANI), is introduced for data imputation, and its performance is evaluated on the imputation of missing traffic data and on short-term traffic prediction

  • It is shown that IGANI significantly outperforms previous GAN-based imputation architectures in accuracy

Introduction

Traffic data is generated faster than before as intelligent transportation systems (ITS) develop. Explicit generative models assume the data follow a density function pθ(x) whose parameters θ are estimated by maximum likelihood using the log-likelihood function log pθ(x). Selecting a computationally tractable density is an important step in explicit generative modeling, while choosing a density function capable of capturing the complexity of the data is not straightforward. A typical example is the variational autoencoder (VAE), which uses a tractable lower bound on an intractable log-likelihood [26]. Another option is the Boltzmann machine, which is based on Markov chain Monte Carlo (MCMC) [27]. Boltzmann machines simulate a sequence of samples x′ ∼ q(x′ | x), where q is a transition probability density designed in such a way that the distribution of the samples converges to p(x). While variational methods like VAEs are affected by the accuracy of the posterior or prior distributions, MCMC methods, such as Boltzmann machines, suffer from slow convergence [21].
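The MCMC idea referenced above — simulating x′ ∼ q(x′ | x) so that the distribution of the samples converges to p(x) — can be illustrated with a minimal random-walk Metropolis sampler. This is a generic MCMC sketch, not a Boltzmann machine; the target and step size are chosen for illustration only.

```python
import numpy as np

def metropolis(log_p, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: the sample distribution converges to p(x)."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        x_prop = x + step * rng.normal()  # propose x' ~ q(x' | x), symmetric q
        # Accept with probability min(1, p(x') / p(x)), computed in log space.
        if np.log(rng.random()) < log_p(x_prop) - log_p(x):
            x = x_prop
        samples[i] = x
    return samples

# Target: standard normal, log p(x) = -x^2 / 2 up to an additive constant.
samples = metropolis(lambda x: -0.5 * x**2, x0=3.0, n_samples=20000)
# After discarding burn-in, empirical moments approach those of N(0, 1).
tail = samples[5000:]
assert abs(tail.mean()) < 0.2 and abs(tail.var() - 1.0) < 0.2
```

The slow convergence noted in the text shows up here as the need for many iterations and a burn-in period before the chain forgets its starting point x0.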
