Abstract

Node classification on highly imbalanced graph data is challenging because existing graph neural networks (GNNs) typically assume a balanced class distribution when learning node embeddings. When the class distribution is imbalanced, they tend to be biased toward nodes of the majority classes, leaving nodes of the minority classes under-represented. To overcome this challenge, this work introduces a novel GNN-based Imbalanced Node Classification Model (GNN-INCM) suited to class-imbalanced graph data. It comprises two cooperative modules: Embedding Clustering-based Optimization (ECO) and Graph Reconstruction-based Optimization (GRO). ECO first employs a two-layer graph convolutional network (GCN) to obtain node embeddings and then performs clustering analysis to make the embeddings more representative and easier to classify. GRO employs an inner product decoder to reconstruct the graph structure and minimize information loss. In particular, we design a hard sample strategy and integrate it into ECO and GRO to ensure that the embeddings of hard nodes are correctly represented. Furthermore, we propose a Hard Sample-based Knowledge Distillation Method (HSKDM) that trains multiple GNN-INCM models simultaneously to improve overall classification performance. Experiments on three well-known class-imbalanced graph datasets demonstrate that GNN-INCM outperforms current state-of-the-art methods in node classification tasks and that HSKDM substantially improves overall classification performance.
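The two building blocks named in the abstract, a two-layer GCN encoder and an inner product decoder, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the symmetric adjacency normalization, the ReLU activation, the binary cross-entropy reconstruction loss, and the toy graph are all assumptions chosen for illustration, and the clustering step, the hard sample strategy, and HSKDM are omitted.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (assumed, as in a standard GCN)."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_embed(adj, x, w1, w2):
    """Two-layer GCN forward pass: Z = A_norm · ReLU(A_norm · X · W1) · W2."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ x @ w1, 0.0)  # ReLU activation (assumed)
    return a_norm @ hidden @ w2

def inner_product_decode(z):
    """Inner product decoder: edge probability sigmoid(z_i · z_j) for every node pair."""
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))

def reconstruction_loss(adj, adj_rec, eps=1e-9):
    """Binary cross-entropy between the true and reconstructed adjacency (assumed loss)."""
    bce = adj * np.log(adj_rec + eps) + (1.0 - adj) * np.log(1.0 - adj_rec + eps)
    return float(-np.mean(bce))

# Hypothetical toy graph: 4 nodes on a path, 3 input features per node.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.normal(size=(4, 3))
w1 = 0.1 * rng.normal(size=(3, 4))            # untrained, randomly initialized weights
w2 = 0.1 * rng.normal(size=(4, 2))

z = gcn_embed(adj, x, w1, w2)                 # node embeddings, shape (4, 2)
adj_rec = inner_product_decode(z)             # reconstructed adjacency, shape (4, 4)
loss = reconstruction_loss(adj, adj_rec)
```

In a trained model the weights would be optimized so that the reconstruction loss, together with the classification and clustering objectives, decreases; here they are left random, so the sketch only shows how data flows through the encoder and decoder.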
