Abstract

Visual behavior intention inference is crucial for enabling escort robots to interact naturally with humans, but it is very challenging because successive actions in the assistive scenario show large inter-class similarity and small intra-class distinguishability. Reliable inference depends not only on the current state of a behavior but also on semantic information in both the spatial and temporal domains. This paper presents a hierarchical segmentation–detection–recognition system that represents spatio-temporal semantic features to describe body parts, trajectories and the deep relationships among sub-behaviors. Specifically, a dense trajectory matching scheme based on temporal sampling and the Binarized Normed Gradients (BING) algorithm is formulated to segment 3-dimensional (3D) behavior cubes. Local trajectories are then obtained by clustering the dense trajectories according to distance similarity, and body parts are detected by multi-kernel learning on the encoded local features. Moreover, a global three-stream context Convolutional Neural Network (CNN) is proposed for behavior classification, with a texture module built from expansion, connection and 1D convolution operations. Scene information is also recognized efficiently by means of transfer learning. Finally, the semantic descriptors are modeled by two cascaded And-Or Graphs (AoGs) that constrain the spatial scenarios and temporal sequences. The unified approach is demonstrated on two public benchmarks containing long-term activities and on an escort robot in real-world applications.
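
To make the step of clustering dense trajectories by distance similarity more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the distance measure or the clustering rule, so the mean point-wise Euclidean distance, the single-linkage grouping, the function name `cluster_trajectories` and the `threshold` parameter are all assumptions made for illustration.

```python
import numpy as np


def cluster_trajectories(trajs, threshold):
    """Group trajectories into clusters by distance similarity.

    Assumptions (not from the paper): each trajectory is a sequence of
    T 2D points, the distance between two trajectories is the mean
    Euclidean distance between corresponding points, and trajectories
    are linked whenever that distance is below `threshold`.

    trajs: array-like of shape (n, T, 2)
    Returns a list of n integer cluster labels.
    """
    trajs = np.asarray(trajs, dtype=float)
    n = len(trajs)

    # Pairwise mean point-wise distance, shape (n, n).
    diff = trajs[:, None, :, :] - trajs[None, :, :, :]
    dist = np.linalg.norm(diff, axis=-1).mean(axis=-1)

    labels = [-1] * n
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        # Flood-fill every trajectory reachable through links that are
        # shorter than the threshold (single-linkage grouping).
        labels[i] = current
        stack = [i]
        while stack:
            j = stack.pop()
            for k in range(n):
                if labels[k] == -1 and dist[j, k] < threshold:
                    labels[k] = current
                    stack.append(k)
        current += 1
    return labels


if __name__ == "__main__":
    # Two nearly parallel trajectories and one far away from both.
    demo = [
        [[0, 0], [1, 0], [2, 0]],
        [[0, 0.1], [1, 0.1], [2, 0.1]],
        [[10, 10], [11, 10], [12, 10]],
    ]
    print(cluster_trajectories(demo, threshold=1.0))
```

In this sketch, each resulting cluster plays the role of one local trajectory; a real system would likely use a richer similarity measure and a more scalable clustering method.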
