Abstract

In our lives, the interactions between humans and objects around them are happening all the time. The purpose of human-object interaction(HOI) detection is to locate humans and objects in a visual scene and infer the type of interaction between the two. Most of the HOI detection works infer the interaction type from two aspects: spatial features and visual features. We adopt the same strategy as them, but in the stage of extracting spatial features, we believe that the relationships between human body parts and objects are very important. We choose distance as a standard to measure the relationship between body parts and objects, and add it to the process of extracting spatial features, then use the refined spatial features to further extract visual features. Our experimental results on the V-COCO dataset show that our new model based on distance is effective. Compared with other methods, the accuracy of our model is significantly improved.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.