Abstract

The graph neural network (GNN) has been widely used for graph data representation. However, existing research considers mainly ideal, balanced datasets; imbalanced datasets are rarely addressed. Traditional techniques for handling imbalanced data, such as resampling, reweighting, and synthetic sample generation, are no longer directly applicable to GNNs. This study proposes an ensemble model called Boosting-GNN, which uses GNNs as the base classifiers during boosting. In Boosting-GNN, higher weights are assigned to training samples that were misclassified by the previous classifiers, yielding higher classification accuracy and better reliability. In addition, transfer learning is used to reduce computational cost and increase fitting ability. Experimental results indicate that the proposed Boosting-GNN model outperforms the graph convolutional network (GCN), GraphSAGE, the graph attention network (GAT), simplified graph convolutional networks (SGC), multi-scale graph convolutional networks (N-GCN), and state-of-the-art reweighting and resampling methods on synthetic imbalanced datasets, with an average performance improvement of 4.5%.

Highlights

  • Convolutional neural networks (CNNs) have been widely used in image recognition (Russakovsky et al., 2015; He et al., 2016), object detection (Lin et al., 2014), speech recognition (Yu et al., 2016), and visual coding and decoding (Huang et al., 2021a,b).

  • The multi-scale graph convolutional network (N-GCN) obtains a feature representation of each node by convolving around the nodes at different scales and fusing all the convolution results, which slightly improves classification compared with the GCN.

  • Resampling and reweighting methods can improve the accuracy of graph neural networks (GNNs) on imbalanced datasets, but the improvement is very limited.

Introduction

Convolutional neural networks (CNNs) have been widely used in image recognition (Russakovsky et al., 2015; He et al., 2016), object detection (Lin et al., 2014), speech recognition (Yu et al., 2016), and visual coding and decoding (Huang et al., 2021a,b). Graph neural networks (GNNs) can effectively construct deep learning models on graphs. In addition to homogeneous graphs, heterogeneous GNNs (Wang et al., 2019; Li et al., 2021; Peng et al., 2021) can effectively handle more comprehensive information and semantically richer heterogeneous graphs. The graph convolutional network (GCN) (Kipf and Welling, 2016) has achieved remarkable success in multiple graph-data-related tasks, including recommendation systems (Chen et al., 2020; Yu and Qin, 2020), molecular recognition (Zitnik and Leskovec, 2017), traffic forecasting (Bai et al., 2020), and point cloud segmentation (Li et al., 2019). The GCN outperforms conventional methods on node classification, but it is adversely affected by dataset imbalance. Existing studies on GCNs all target balanced datasets, and the problem of imbalanced datasets has not been considered.
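The boosting mechanism described in the abstract, in which samples misclassified by earlier base classifiers receive larger training weights, can be illustrated with a single AdaBoost-style (SAMME) reweighting step. This is a minimal sketch of the general technique, not the paper's exact Boosting-GNN update; the function name and the toy labels are illustrative assumptions.

```python
import numpy as np

def update_sample_weights(weights, y_true, y_pred, n_classes):
    """One AdaBoost-style (SAMME) reweighting step: samples the current
    base classifier misclassified receive larger weights in the next round.
    Illustrative sketch; not the exact Boosting-GNN update rule."""
    incorrect = (y_true != y_pred).astype(float)
    # Weighted error rate of the current base classifier.
    err = np.average(incorrect, weights=weights)
    # Classifier weight; log(n_classes - 1) is the multi-class SAMME term.
    alpha = np.log((1.0 - err) / max(err, 1e-10)) + np.log(n_classes - 1)
    # Upweight only the misclassified samples, then renormalize.
    new_weights = weights * np.exp(alpha * incorrect)
    return new_weights / new_weights.sum()

# Toy example: 4 equally weighted samples, 3 classes; sample 2 is wrong.
w = np.full(4, 0.25)
y_true = np.array([0, 1, 2, 1])
y_pred = np.array([0, 1, 0, 1])
w = update_sample_weights(w, y_true, y_pred, n_classes=3)
# The misclassified sample's weight grows relative to the others.
```

With error rate 0.25 over three classes, the misclassified sample's weight rises to 2/3 after normalization while the correctly classified samples share the rest, so the next base classifier focuses on the hard sample.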

