Abstract

Human-object interaction (HOI) detection identifies the relationships between humans and objects in still images and videos. Most HOI detection methods rely on appearance features as the primary cue for detecting these relationships. Moreover, model performance suffers from the abundance of false-positive pairs produced by non-interactive human-object pairs in an image and by human-object mis-grouping. In this paper, we propose "Spatial-Net", a new approach to HOI detection in still images. The proposed approach divides the HOI problem into two tasks: pair-prediction and global-rejection. In the pair-prediction task, spatial relationships are used to predict the interaction for each human-object pair from three kinds of spatial features: a spatial map, a single-channel image that represents the human-object pair, including body parts and object masks; relative geometry features such as relative size, relative distance, and the intersection-over-union between body parts and objects; and a weighted distance that serves as a deterministic body-part attention model. In the global-rejection task, an augmented model rejects false-positive pairs: Hungarian matching assigns human-object pairs to each action, and a human-centric model rejects non-interacting pairs according to the semantic co-occurrence of human and object. Experimental results on the V-COCO dataset demonstrate that Spatial-Net outperforms many state-of-the-art HOI models while requiring less inference time.
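To make the relative geometry features concrete, the sketch below computes relative size, relative distance, and intersection-over-union for one body-part/object box pair. The abstract does not give the paper's exact formulations, so this is a minimal sketch assuming conventional definitions; the function name, normalization choices, and epsilon terms are illustrative, not taken from the paper.

```python
import numpy as np

def relative_geometry_features(part_box, obj_box, img_w, img_h):
    """Illustrative relative-geometry features for one body-part/object pair.

    Boxes are (x1, y1, x2, y2) in pixels. Returns [relative size,
    relative distance, IoU] under common definitions of those terms.
    """
    px1, py1, px2, py2 = part_box
    ox1, oy1, ox2, oy2 = obj_box

    part_area = max(px2 - px1, 0) * max(py2 - py1, 0)
    obj_area = max(ox2 - ox1, 0) * max(oy2 - oy1, 0)

    # Relative size: ratio of the body-part box area to the object box area.
    rel_size = part_area / (obj_area + 1e-8)

    # Relative distance: center-to-center distance, normalized by the
    # image diagonal so the feature is scale-invariant.
    pcx, pcy = (px1 + px2) / 2, (py1 + py2) / 2
    ocx, ocy = (ox1 + ox2) / 2, (oy1 + oy2) / 2
    rel_dist = np.hypot(pcx - ocx, pcy - ocy) / np.hypot(img_w, img_h)

    # Intersection-over-union between the body-part box and the object box.
    ix1, iy1 = max(px1, ox1), max(py1, oy1)
    ix2, iy2 = min(px2, ox2), min(py2, oy2)
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    iou = inter / (part_area + obj_area - inter + 1e-8)

    return np.array([rel_size, rel_dist, iou], dtype=np.float32)
```

For the global-rejection task, the abstract names the Hungarian matching technique for assigning human-object pairs per action. The sketch below shows one common way to apply it using SciPy's `linear_sum_assignment`; the score matrix, its shape, and the per-action usage are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_pairs(scores):
    """Assign each human to at most one object for a given action class.

    `scores` is an (n_humans, n_objects) matrix of interaction scores for
    one action; Hungarian matching maximizes the total assigned score.
    """
    # linear_sum_assignment minimizes total cost, so negate the scores.
    rows, cols = linear_sum_assignment(-scores)
    return list(zip(rows.tolist(), cols.tolist()))

# Example: 2 humans, 3 candidate objects for a single action class.
scores = np.array([[0.9, 0.2, 0.1],
                   [0.3, 0.8, 0.4]])
print(assign_pairs(scores))  # [(0, 0), (1, 1)]
```

Pairs left unassigned by the matching, or assigned but rejected by the human-centric co-occurrence model, would then be discarded as false positives.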
