Abstract

Human-object interaction (HOI) detection is a new branch of visual relationship detection and plays an important role in image understanding. Because image content is complex and diverse, detecting HOIs remains a difficult challenge. Unlike most current HOI detection methods, which rely only on the pairwise information of a human and an object, we propose a graph-based HOI detection method that models context and global structure information. First, to better exploit the relations between humans and objects, the detected humans and objects are treated as nodes of a fully connected undirected graph, which is then pruned to obtain an HOI graph that preserves only the edges connecting human and object nodes. Then, to obtain more robust features for the human and object nodes, two attention-based feature extraction networks are proposed, which model global and local contexts respectively. Finally, a graph attention network is introduced to iteratively pass messages between the nodes of the HOI graph and detect potential HOIs. Experiments on the V-COCO and HICO-DET datasets verify the effectiveness of the proposed method and show that it outperforms many existing methods.
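
The graph construction and pruning step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_hoi_graph`, the use of the label `"person"` to mark human nodes, and the rule that a kept edge has exactly one human endpoint are assumptions made for illustration.

```python
from itertools import combinations


def build_hoi_graph(labels):
    """Build a fully connected graph over detections, then prune it.

    labels: one category name per detection. The label "person" is
    assumed (hypothetically) to mark human nodes; all others are objects.
    """
    nodes = range(len(labels))
    # Fully connected undirected graph: every unordered pair of detections.
    full_edges = list(combinations(nodes, 2))
    # Pruning (assumed rule): keep an edge only if exactly one endpoint
    # is a human, i.e. drop human-human and object-object edges.
    hoi_edges = [
        (i, j)
        for i, j in full_edges
        if (labels[i] == "person") != (labels[j] == "person")
    ]
    return full_edges, hoi_edges


labels = ["person", "bicycle", "person", "cup"]
full_edges, hoi_edges = build_hoi_graph(labels)
print(len(full_edges), hoi_edges)
```

With two humans and two objects, the full graph has six edges and the pruned HOI graph keeps the four human-object edges. Node features and attention-based message passing would then operate on this pruned edge set.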
