Abstract
The lack of traffic data is a bottleneck restricting the development of Intelligent Transportation Systems (ITS). Most existing traffic data completion methods target low‐dimensional data and cannot cope with high‐dimensional video data. Therefore, this paper proposes a traffic data complete generation adversarial network (TDC‐GAN) model to solve the problem of missing frames in traffic video. Based on the Feature Pyramid Network (FPN), we design a multiscale semantic information extraction model that employs a convolution mechanism to mine informative features from high‐dimensional data. Moreover, by constructing a discriminator with global and local branch networks, temporal and spatial information is captured to ensure the spatio‐temporal consistency of consecutive frames. Finally, the TDC‐GAN model is evaluated with single‐frame and multiframe completion experiments on the Caltech pedestrian dataset and the KITTI dataset. The results show that the proposed model can complete the corresponding missing frames in video sequences and achieves good performance in quantitative comparative analysis.
Highlights
In recent years, with the rapid development of Intelligent Transportation Systems (ITS), large volumes of data rich in traffic information have attracted widespread attention from researchers [1,2,3,4].
Deep learning methods have become an important tool for video sequence modeling, especially generative adversarial networks (GANs), which are good at capturing complex features in high-dimensional data thanks to their outstanding learning capability.
To verify the versatility of the proposed traffic data complete generation adversarial network (TDC-GAN) model, we evaluated it on the KITTI dataset [47], which contains real video data collected in scenes such as urban areas, rural areas, and highways; each frame can contain up to 15 cars and 30 pedestrians.
Summary
With the rapid development of Intelligent Transportation Systems (ITS), large volumes of data rich in traffic information have attracted widespread attention from researchers [1,2,3,4]. Most existing studies focus on completing low-dimensional data (e.g., traffic flow [13], travel time [14, 15], and trajectory [16]) and cannot cope with high-dimensional traffic video data, which contains more intuitive information. This can be attributed to two reasons. Traffic video scenes are relatively complex and usually include a large number of vehicles and pedestrians, which makes it difficult for low-dimensional traffic completion models to explicitly extract semantic information from traffic scenes.
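The FPN-based multiscale extraction mentioned above can be illustrated with a minimal sketch. This toy NumPy version is an assumption for illustration only: average pooling stands in for the learned strided convolutions of the bottom-up pathway, nearest-neighbour upsampling and identity lateral connections stand in for the learned 1x1 convolutions, and all function names are hypothetical rather than taken from the paper.

```python
import numpy as np

def avg_pool2(x):
    """Downsample a (H, W) feature map by 2 with average pooling."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour 2x upsampling for the top-down pathway."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def feature_pyramid(frame, levels=3):
    """FPN-style pyramid: bottom-up downsampling, then top-down
    upsampling merged with lateral connections at each scale."""
    bottom_up = [frame]
    for _ in range(levels - 1):
        bottom_up.append(avg_pool2(bottom_up[-1]))
    merged = [bottom_up[-1]]                       # coarsest level
    for lateral in reversed(bottom_up[:-1]):
        merged.append(lateral + upsample2(merged[-1]))
    return list(reversed(merged))                  # finest to coarsest
```

Each returned level mixes fine spatial detail (from the lateral path) with coarse semantic context (from the top-down path), which is the property the multiscale extractor relies on.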