Abstract
Human intention prediction is of great significance in many applications, such as human-robot interaction and intelligent rehabilitation robots. This paper studies the problem of short-term next-active-object prediction in egocentric images. The short-term next-active-object is the object that a human is going to interact with in the near future, and it is an embodiment of human intention. Most current methods rely on object-centered cues, such as deviations in object appearance change and the unique shape of the egocentric object trajectory, to predict the next-active-object. In this paper, inspired by the fact that human intention is also revealed by human-centered cues, we propose a deep neural network model that integrates cues from visual attention and hand positions to predict the next-active-object. First, probability maps of visual attention and hand positions are constructed; the probability distribution of the next-active-object is then generated from them. We experimentally compare our method with several baseline methods on two datasets and confirm its effectiveness. In addition, we conduct ablation experiments and discuss crucial points concerning the next-active-object.