Abstract

Pedestrian trajectory prediction in crowd scenes plays a significant role in intelligent transportation systems. The main challenges lie in learning motion patterns and addressing future uncertainty. Typically, trajectory prediction is considered along two dimensions: modeling temporal dynamics and capturing social interactions. For temporal dependencies, although existing models based on recurrent neural networks (RNNs) or convolutional neural networks (CNNs) achieve high performance on short-term prediction, they still suffer from limited scalability to long sequences. For social interactions, previous graph-based methods consider only fixed features and ignore dynamic interactions between pedestrians. Since the transformer network has a strong capability for capturing spatial and long-term temporal dynamics, we propose the Long-Short Term Spatio-Temporal Aggregation (LSSTA) network for human trajectory prediction. First, a spatial encoder, built on a modern variant of graph neural networks, characterizes spatial interactions between pedestrians. Second, LSSTA uses a transformer network to handle long-term temporal dependencies and aggregates the spatial and temporal features with a temporal convolution network (TCN); the TCN is thus combined with the transformer to form a long-short term temporal dependency encoder. Additionally, multi-modal prediction is an effective way to address future uncertainty, so we extend existing auto-encoder modules with static scene information and future ground truth for multi-modal trajectory prediction. Experimental results on complex scenes demonstrate the superior performance of our method in comparison to existing approaches.
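The abstract's pipeline of a graph-based spatial encoder followed by causal temporal aggregation can be illustrated with a minimal sketch. This is not the authors' code: the distance-based soft adjacency, the tensor shapes, and the single-kernel causal convolution are illustrative assumptions standing in for the paper's spatial encoder and TCN.

```python
import numpy as np

def spatial_encode(X, positions):
    """One message-passing step: X is (N, d) per-pedestrian features,
    positions is (N, 2). Neighbours are weighted by a Gaussian kernel
    of pairwise distance (an assumed stand-in for the learned GNN)."""
    diff = positions[:, None, :] - positions[None, :, :]   # (N, N, 2)
    dist2 = (diff ** 2).sum(-1)                            # (N, N)
    A = np.exp(-dist2)                                     # soft adjacency
    A /= A.sum(axis=1, keepdims=True)                      # row-normalise
    return A @ X                                           # (N, d)

def causal_tcn(H, W):
    """Causal 1-D convolution over time: H is (T, d), W is a (k, d)
    kernel. The output at step t sees only steps <= t, as in a TCN."""
    k, d = W.shape
    T = H.shape[0]
    Hp = np.concatenate([np.zeros((k - 1, d)), H], axis=0)  # left padding
    return np.stack([(Hp[t:t + k] * W).sum(axis=0) for t in range(T)])

rng = np.random.default_rng(0)
N, T, d, k = 5, 8, 4, 3          # pedestrians, frames, feature dim, kernel
pos = rng.normal(size=(N, 2))
feats = rng.normal(size=(T, N, d))

# Spatial interactions per frame, then temporal aggregation per pedestrian.
spatial = np.stack([spatial_encode(feats[t], pos) for t in range(T)])   # (T, N, d)
W = rng.normal(size=(k, d))
temporal = np.stack([causal_tcn(spatial[:, i], W) for i in range(N)])   # (N, T, d)
print(temporal.shape)  # (5, 8, 4)
```

The left padding is what makes the convolution causal: changing a future frame cannot alter any earlier output, which is the property the long-short term encoder relies on when combining the TCN with the transformer's long-range attention.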
