Abstract
Human-object interaction (HOI) detection can provide valuable insight into the meaning and interpretation of a painting, because the interactions between humans and objects reveal information about the scene, characters, and story depicted in the artwork. Automatically detecting HOIs in paintings is challenging: paintings often contain complex scenes with intricate details and wide variation in artistic style. Moreover, unlike real-world images, the scenes depicted in paintings need not obey physical rules, which further complicates detection. This paper proposes a novel system for detecting HOIs in paintings using multi-task learning. The system uses an object detection model to detect instances of human figures and objects, and extracts visual and spatial features from them. The extracted features are then combined in a model optimized for detecting HOIs. To enhance the model's performance on HOI detection, we train it in a multi-task learning setting with four different tasks. This approach lets us leverage representations shared across the tasks, improving both the accuracy and the efficiency of HOI detection. To train and test our model, we introduce SemArt-HOI, a new benchmark for HOI detection in paintings, built by augmenting the existing SemArt dataset with instance detection annotations and interaction classes. Our experiments show that our model outperforms the state-of-the-art one-stage transformer-based HOI detection model in both single-task and multi-task settings. Our system is also more efficient: it trains four times faster than the state-of-the-art model and uses fewer resources, which makes it well suited to practical, large-scale HOI detection in paintings.
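As a rough illustration of two ideas mentioned in the abstract, the sketch below computes a pairwise spatial descriptor from a human box and an object box, and combines several task losses into one training objective. It is a minimal sketch under stated assumptions, not the authors' implementation: the specific descriptor (normalised centre offset, log area ratio, IoU), the weighted-sum loss, and the task names `hoi`, `aux_a`, `aux_b`, `aux_c` are all hypothetical, since the abstract does not specify the spatial features or name the four tasks.

```python
import math

def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def spatial_features(human_box, object_box):
    """Hypothetical pairwise descriptor (an assumption, not from the paper):
    object-centre offset normalised by human size, log area ratio, and IoU."""
    hw, hh = human_box[2] - human_box[0], human_box[3] - human_box[1]
    ow, oh = object_box[2] - object_box[0], object_box[3] - object_box[1]
    hcx, hcy = human_box[0] + hw / 2, human_box[1] + hh / 2
    ocx, ocy = object_box[0] + ow / 2, object_box[1] + oh / 2
    dx = (ocx - hcx) / hw
    dy = (ocy - hcy) / hh
    log_area_ratio = math.log((ow * oh) / (hw * hh))
    return [dx, dy, log_area_ratio, box_iou(human_box, object_box)]

def multitask_loss(task_losses, weights=None):
    """Weighted sum of per-task losses; equal weights by default.
    The task names used by callers are placeholders."""
    if weights is None:
        weights = {name: 1.0 for name in task_losses}
    return sum(weights[name] * loss for name, loss in task_losses.items())

# Example: one human-object pair and four placeholder task losses.
features = spatial_features((0, 0, 10, 20), (5, 10, 15, 20))
total = multitask_loss({"hoi": 1.0, "aux_a": 0.5, "aux_b": 0.25, "aux_c": 0.25})
```

In a multi-task setup of this kind, the gradient of the combined loss flows into whatever layers the tasks share, which is one common way to obtain the shared representations the abstract refers to; the relative weights are a design choice that would need tuning.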
More From: Digital Applications in Archaeology and Cultural Heritage