Abstract

In the field of vision research, it is considered that the analysis of scene graphs can help machines comprehend higher-level visual scene contexts. The data structure that both detects objects and defines their relationships is called a scene graph. The scene graph has been applied to diverse tasks, including action recognition, image captioning and visual question answering. It is also suggested as a method for understanding the video context. The amount of video data needed for training to produce a scene graph is nonetheless insufficient. Therefore, the inference model was applied to real-world data, and the possibility of usage of scene graph as a visual feature was identified.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call