Abstract

Data augmentation has been shown to improve graph neural networks (GNNs). Owing to the complexity and non-Euclidean nature of graph data, existing graph data augmentation is achieved by adding or removing edges or by changing the input node features. However, the graphs generated by these operations have structures similar or identical to that of the original graph, so GNNs learn similar or identical structural information from the original and augmented graphs. Two problems arise from this: restricted information and low applicability. To solve them, we propose Class-hOmophilic-based Data Augmentation (CODA), which improves existing GNNs by helping them learn adequate extra structural class information that is lacking in the original graph, and which extends GNNs' applicability to graphs with many interclass edges. We first pretrain the GNN and then design a new augmentation method that generates an approximate class-homophilic graph according to the pretrained GNN. Finally, we design learnable node-level self-attention mechanisms with telescopic coefficients, which let GNNs integrate the structural information of the two graphs in a more principled way and break the constraint of the pretrained GNN. Extensive experiments on various datasets show that augmentation via CODA improves performance and applicability across GNN architectures. The source code of CODA is publicly available at https://github.com/graphNN/CODA.
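The abstract outlines a three-step pipeline: pretrain a GNN, use it to build an approximate class-homophilic graph (one whose edges mostly connect same-class nodes), and fuse the original and augmented graphs through node-level attention. Below is a minimal sketch of that idea in PyTorch Geometric, not the authors' implementation (see the linked repository for that): the GCN backbone, the confidence threshold, the naive same-predicted-class edge construction, and the sigmoid gate standing in for the telescopic-coefficient attention are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv


class GCN(torch.nn.Module):
    """Two-layer GCN backbone (stand-in for any GNN to be augmented)."""
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, num_classes)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)


def approx_class_homophilic_graph(logits, conf_threshold=0.9):
    """Link every pair of nodes that the pretrained GNN confidently
    assigns to the same class (naive O(N^2) version for small graphs)."""
    probs = logits.softmax(dim=-1)
    conf, pred = probs.max(dim=-1)
    idx = (conf >= conf_threshold).nonzero(as_tuple=False).view(-1).tolist()
    src, dst = [], []
    for a in idx:
        for b in idx:
            if a < b and pred[a] == pred[b]:
                src += [a, b]          # add both directions:
                dst += [b, a]          # the augmented graph is undirected
    return torch.tensor([src, dst], dtype=torch.long)


class TwoGraphFusion(torch.nn.Module):
    """Runs one backbone on both graphs and mixes the outputs with a
    learnable per-node gate -- a crude stand-in for the paper's node-level
    self-attention with telescopic coefficients."""
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.gnn = GCN(in_dim, hid_dim, num_classes)
        self.gate = torch.nn.Linear(2 * num_classes, 1)

    def forward(self, x, ei_orig, ei_aug):
        z_orig = self.gnn(x, ei_orig)
        z_aug = self.gnn(x, ei_aug)
        alpha = torch.sigmoid(self.gate(torch.cat([z_orig, z_aug], dim=-1)))
        return alpha * z_orig + (1.0 - alpha) * z_aug


# Toy usage: 40 random nodes, 16 features, 3 classes.
x = torch.randn(40, 16)
edge_index = torch.randint(0, 40, (2, 200))
pretrained = GCN(16, 32, 3)                      # pretend this was pretrained
with torch.no_grad():
    logits = pretrained(x, edge_index)
    # Low threshold only so the untrained toy model yields some edges.
    aug_edge_index = approx_class_homophilic_graph(logits, conf_threshold=0.45)
model = TwoGraphFusion(16, 32, 3)
out = model(x, edge_index, aug_edge_index)       # per-node class logits, [40, 3]
```

Note that the all-pairs edge construction above is quadratic in the number of confident nodes; on real datasets one would sparsify it, for example by keeping only a few same-class neighbors per node.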
