Abstract

Heterogeneous graphs model real-world data through different types of nodes and the relationships between them, and they can represent multimodal data comprising text, image, and audio. Graph Neural Networks (GNNs) are a prominent graph representation learning method that exploits the graph structure and its attributes; when applied to a multimodal heterogeneous graph, they learn a single semantic space for the different modalities. Consequently, multimodal fusion becomes possible through simple operators such as sum, average, or multiplication, generating unified representations that account for the supplementary and complementary relationships between the modalities. In multimodal heterogeneous graphs, the labeling process tends to be even more costly because multiple modalities must be analyzed, and some applications also have inherent class imbalance. One-Class Learning (OCL) is used to overcome these problems in applications that comprise a class of interest. Given the lack of studies on multimodal early fusion in heterogeneous graphs for OCL tasks, we proposed a method based on unsupervised GNNs for heterogeneous graphs and evaluated different early fusion operators. In this paper, we extend previous work by evaluating the behavior of the main GNN convolutions in the method. Average, addition, and subtraction were the best early fusion operators. In addition, GNN layers that do not use an attention mechanism performed better. We therefore argue for multimodal heterogeneous graph neural networks that use simple early fusion operators, rather than the commonly used concatenation, and less complex convolutions.
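
The element-wise fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each node already has one embedding per modality in the same semantic space (same dimensionality). The function names `early_fusion` and `concat_fusion`, the operator labels, and the toy vectors are hypothetical and chosen only for illustration; they are not taken from the paper's implementation.

```python
import numpy as np


def early_fusion(text_emb, image_emb, operator="average"):
    """Fuse two same-shaped modality embeddings element-wise (illustrative only)."""
    if text_emb.shape != image_emb.shape:
        raise ValueError("element-wise fusion needs embeddings of the same shape")
    if operator == "sum":
        return text_emb + image_emb
    if operator == "average":
        return (text_emb + image_emb) / 2.0
    if operator == "subtraction":
        return text_emb - image_emb
    if operator == "multiplication":
        return text_emb * image_emb
    raise ValueError(f"unknown operator: {operator}")


def concat_fusion(text_emb, image_emb):
    """Concatenation baseline: the fused vector doubles in size."""
    return np.concatenate([text_emb, image_emb], axis=-1)


# Toy embeddings for two nodes, two dimensions per modality (made-up values).
text_emb = np.array([[1.0, 2.0], [3.0, 4.0]])
image_emb = np.array([[3.0, 0.0], [1.0, 2.0]])

fused_avg = early_fusion(text_emb, image_emb, "average")
fused_sub = early_fusion(text_emb, image_emb, "subtraction")
fused_cat = concat_fusion(text_emb, image_emb)
```

In this sketch the element-wise operators keep the fused representation at the original dimensionality, while concatenation doubles it; this is one practical difference between the two families of operators.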
