Abstract

Spatiotemporal prediction is difficult because of the intricate scene content and motion variation present in spatiotemporal data. Existing methods struggle to forecast accurately over long horizons, particularly for transient motions with pronounced trends, such as hand lifts, jumps, or vehicle turns. To address these challenges, we introduce a Spatiotemporal Motion Prediction Network based on Multi-level Feature Disentanglement (FDPNet). The model divides spatiotemporal prediction into two stages: feature disentanglement and motion prediction. First, we devise a Multi-level Feature Disentanglement (MFD) module that decomposes the motion features of the temporal sequence into period, trend, and residual components. By decoupling spatial and temporal information in this way, the network can capture the underlying laws that govern motion during spatiotemporal evolution. Second, to improve prediction accuracy over long horizons, we introduce the Motion Differential Self-Attention LSTM unit (MDSA-LSTM), which applies differential operations to extract inter-frame motion trends and uses an enhanced self-attention mechanism to capture long-range spatiotemporal correlations. FDPNet attains state-of-the-art performance on the Moving MNIST, UCF101, KITTI, and Caltech Pedestrian datasets, demonstrating the potential of this approach for spatiotemporal prediction.
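The decomposition into period, trend, and residual components resembles classical series decomposition. Below is a minimal PyTorch sketch of that general idea, not the paper's actual MFD design: a moving average supplies the trend, the strongest frequency of the detrended signal approximates the periodic part, and whatever remains is the residual. The class name, the kernel size, and the FFT-based period estimate are all illustrative assumptions.

    import torch
    import torch.nn as nn

    class SeriesDecomposition(nn.Module):
        """Sketch of period/trend/residual decomposition along the time axis.

        Hypothetical illustration of the MFD idea only; the kernel size
        and the single-frequency period estimate are assumptions.
        """

        def __init__(self, kernel_size: int = 5):
            super().__init__()
            # Moving average over time yields a smooth trend estimate.
            self.pool = nn.AvgPool1d(kernel_size, stride=1, padding=kernel_size // 2)

        def forward(self, x: torch.Tensor):
            # x: (batch, time, features) -- per-frame feature vectors.
            trend = self.pool(x.transpose(1, 2)).transpose(1, 2)
            detrended = x - trend
            # Keep only the dominant frequency as a crude "period" component.
            spec = torch.fft.rfft(detrended, dim=1)
            mag = spec.abs().mean(dim=(0, 2))
            k = int(mag[1:].argmax()) + 1          # skip the DC bin
            mask = torch.zeros_like(spec)
            mask[:, k, :] = 1.0
            period = torch.fft.irfft(spec * mask, n=x.size(1), dim=1)
            residual = detrended - period
            return period, trend, residual

For example, `SeriesDecomposition()(torch.randn(2, 16, 64))` splits a 16-frame feature sequence into the three components, each with the input's shape.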
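Similarly, the MDSA-LSTM is described as combining inter-frame differencing with self-attention. The following sketch shows one plausible form of that combination, under stated assumptions rather than the authors' implementation: the difference between consecutive hidden states serves as a motion cue, and self-attention over spatial positions propagates it over long ranges. The module name, the single attention head, and the additive fusion are all hypothetical choices.

    import torch
    import torch.nn as nn

    class MotionDiffAttention(nn.Module):
        """Sketch of the MDSA idea: frame differencing for the motion
        trend, self-attention for long-range spatial correlation.
        Layer sizes and the fusion rule are illustrative assumptions.
        """

        def __init__(self, channels: int):
            super().__init__()
            self.attn = nn.MultiheadAttention(channels, num_heads=1, batch_first=True)
            self.norm = nn.LayerNorm(channels)

        def forward(self, h_t: torch.Tensor, h_prev: torch.Tensor):
            # h_t, h_prev: (batch, channels, height, width) hidden states.
            diff = h_t - h_prev                       # differential motion cue
            b, c, hh, ww = diff.shape
            tokens = diff.flatten(2).transpose(1, 2)  # (batch, H*W, channels)
            ctx, _ = self.attn(tokens, tokens, tokens)
            ctx = self.norm(ctx + tokens)
            motion = ctx.transpose(1, 2).reshape(b, c, hh, ww)
            return h_t + motion                       # fuse motion context back in

In a full recurrent cell, the returned tensor would feed the LSTM gating; the additive fusion here is kept deliberately simple for brevity.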
