Abstract

Multi-person action forecasting is an emerging topic in computer vision and a pivotal step toward semantic-level video understanding. The task is difficult because of the complexity of spatial and temporal dependencies, and the state-of-the-art literature has not yet responded adequately to this challenge; better per-actor forecasting of forthcoming actions therefore remains an open problem. Toward this end, we put forth a novel RElational Spatio-TEmPoral learning approach (RESTEP) for multi-person action forecasting. RESTEP captures what inherently characterizes actions by extending relational reasoning to incorporate spatial and temporal information in a single pass (spatio-temporal dependencies), which enables it to predict the actions of all actors in the scene simultaneously. This design differs significantly from mainstream works that process spatial and temporal dependencies independently. RESTEP first builds a graph upon the historical observations, then reasons over the relational spatio-temporal context to extrapolate future actions. To strengthen the understanding of individual actions that may vary over time, we further model the evolution of spatio-temporal dependencies by optimizing the corresponding mutual information. We evaluate RESTEP on the large-scale Atomic Visual Actions (AVA) dataset, the Activities in Extended Videos (ActEV/VIRAT) dataset, and the Joint-annotated Human Motion Data Base (J-HMDB). The experimental results show that RESTEP delivers considerable improvements over recent leading studies.
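To make the high-level description above concrete, the following is a minimal sketch (not the authors' implementation) of relational reasoning over a graph of per-actor features that mixes spatial and temporal dependencies in a single pass. The module name, feature shapes (T observed frames, N actors, D-dimensional features), and the 80-class output are illustrative assumptions; the mutual-information objective mentioned in the abstract is omitted here.

```python
# Hypothetical sketch of relational spatio-temporal reasoning, assuming PyTorch
# and per-actor features x of shape (T, N, D): T observed frames, N actors.
import torch
import torch.nn as nn

class RelationalSpatioTemporalBlock(nn.Module):
    def __init__(self, dim, num_actions):
        super().__init__()
        # Pairwise relation function shared across all (frame, actor) node pairs,
        # so spatial and temporal dependencies are handled jointly.
        self.relation = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        self.classifier = nn.Linear(dim, num_actions)

    def forward(self, x):
        # x: (T, N, D) -> flatten all spatio-temporal nodes into one graph.
        T, N, D = x.shape
        nodes = x.reshape(T * N, D)
        # Score every ordered pair of nodes and aggregate the messages.
        xi = nodes.unsqueeze(1).expand(-1, T * N, -1)      # (TN, TN, D)
        xj = nodes.unsqueeze(0).expand(T * N, -1, -1)      # (TN, TN, D)
        messages = self.relation(torch.cat([xi, xj], dim=-1)).mean(dim=1)
        context = (nodes + messages).reshape(T, N, D)
        # Predict each actor's future action from its latest relational context.
        return self.classifier(context[-1])                # (N, num_actions)

# Usage: 8 observed frames, 5 actors, 256-d features, 80 action classes (AVA-like).
block = RelationalSpatioTemporalBlock(dim=256, num_actions=80)
logits = block(torch.randn(8, 5, 256))  # (5, 80) future-action logits per actor
```

Because every node attends to every other node across both actors and frames, a single forward pass yields forecasts for all actors simultaneously, in contrast to pipelines that first model spatial relations per frame and then temporal dynamics per actor.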
