A Two-Block RNN-Based Trajectory Prediction From Incomplete Trajectory

Ryo Fujii,Hideo Saito,Ryo Hachiuma,Jayakorn Vongkulbhisal

doi:10.1109/access.2021.3072135

Abstract

Trajectory prediction has gained great attention, and significant progress has been made in recent years. However, most works rely on the key assumption that each video is successfully preprocessed by detection and tracking algorithms and that the complete observed trajectory is always available. In complex real-world environments, we often encounter miss-detection of target agents (e.g., pedestrians, vehicles) caused by poor image conditions, such as occlusion by other agents. In this paper, we address the problem of trajectory prediction from an incomplete observed trajectory due to miss-detection, where the observed trajectory includes several missing data points. We introduce a two-block RNN model that approximates the inference steps of the Bayesian filtering framework and seeks the optimal estimation of the hidden state when miss-detection occurs. The model uses one of two RNNs depending on the detection result. One RNN approximates the inference step of the Bayesian filter with the new measurement when the detection succeeds, while the other approximates the inference step when the detection fails. Our experiments show that the proposed model improves the prediction accuracy compared to three baseline imputation methods on the publicly available ETH and UCY datasets (9% and 7% improvement on the ADE and FDE metrics). We also show that our proposed method achieves better prediction than the baselines when there is no miss-detection.
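The switching idea described above can be illustrated with a minimal sketch. This is not the authors' architecture: the class name `TwoBlockRNN`, the plain tanh recurrent cells, the hidden size, the untrained random weights, and the use of `None` to mark a missed detection are all assumptions made for illustration. It only shows how one recurrent block can be applied when a measurement is available and another when it is missing.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TwoBlockRNN:
    """Hypothetical sketch: one recurrent block per detection outcome."""

    def __init__(self, hidden_size=4, input_size=2, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Block applied when detection succeeds: uses the hidden state
        # and the new measurement (assumed here to be an (x, y) position).
        self.w_hh_obs = make_matrix(hidden_size, hidden_size, rng)
        self.w_xh_obs = make_matrix(hidden_size, input_size, rng)
        # Block applied on miss-detection: uses the hidden state only.
        self.w_hh_miss = make_matrix(hidden_size, hidden_size, rng)

    def step(self, hidden, measurement):
        """Advance the hidden state by one time step."""
        if measurement is None:
            pre_activation = matvec(self.w_hh_miss, hidden)
        else:
            from_hidden = matvec(self.w_hh_obs, hidden)
            from_input = matvec(self.w_xh_obs, measurement)
            pre_activation = [a + b for a, b in zip(from_hidden, from_input)]
        return [math.tanh(p) for p in pre_activation]

    def encode(self, observations):
        """Run over a trajectory; None marks a missed detection."""
        hidden = [0.0] * self.hidden_size
        for measurement in observations:
            hidden = self.step(hidden, measurement)
        return hidden


# Observed trajectory with the third point missing.
observed = [(0.0, 0.0), (0.1, 0.2), None, (0.3, 0.6)]
model = TwoBlockRNN()
final_hidden = model.encode(observed)
```

In a trained model, the final hidden state would be passed to a decoder that outputs the future positions. The sketch stops at the encoder because the abstract does not describe the decoder.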

Highlights

Predicting future trajectory from video data is an indispensable technology for developing navigation systems that can be used in several scenarios, such as self-driving vehicles, social robots, and navigation systems for blind people
We investigate the problem of trajectory prediction from incomplete observations due to miss-detection
We propose a two-block recurrent neural network (RNN) model that learns the inference step of Bayesian filters for trajectory prediction from an incomplete observed trajectory due to miss-detection

Summary

Introduction

Predicting future trajectories from video data is an indispensable technology for developing navigation systems that can be used in several scenarios, such as self-driving vehicles, social robots, and navigation systems for blind people. The most common setting of trajectory prediction is the surveillance setting from a fixed camera, where the position of the agent (e.g., pedestrians, vehicles) is often treated as a single point [1], [2]. Many approaches to trajectory prediction forecast the future state of the target agent conditioned on the history of past states [7]–[10]. To obtain the position history of the agents, we need to detect where the agents are in the current and past time steps (e.g., object detection) and to establish object

Objectives

Methods

Results

Discussion

Conclusion