Abstract

The class imbalance problem is a significant challenge in node classification tasks. Because majority-class samples dominate imbalanced data, the model tends to favor the majority class and consequently has insufficient ability to identify minority classes, while evaluation metrics such as accuracy may not fully reflect the model's performance. To mitigate these undesirable effects, we propose GraphSHX, a framework that synthesizes minority-class samples to balance the number of samples across classes and integrates an XGBoost model for node classification during training. Conventional graph neural networks (GNNs) yielded unsatisfactory results, possibly because of the limited number of newly generated nodes; we therefore introduce a meta-mechanism for small-sample problems and employ meta-learning to improve performance on small-sample tasks by learning from a large number of tasks. An empirical evaluation of node classification on six publicly available datasets demonstrates that our data-balancing method outperforms existing state-of-the-art loss-correction and synthetic-node methods. Adding the XGBoost model and meta-learning improves accuracy by 5% to 10%, and the overall accuracy of the improved model is 15% higher than that of the baseline method.
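
To make the class-balancing idea concrete, the sketch below illustrates one common way to realize it: SMOTE-style interpolation between minority-class node embeddings, followed by an XGBoost classifier trained on the balanced set. This is not the authors' GraphSHX implementation; all function names, parameters, and the toy data are illustrative assumptions.

```python
# Hedged sketch (not the GraphSHX code): synthesize minority-class node
# embeddings by interpolating between nearest minority neighbors (SMOTE-style),
# then train an XGBoost classifier on the balanced embedding set.
import numpy as np
from xgboost import XGBClassifier


def synthesize_minority(emb, labels, minority_class, n_new, k=5, rng=None):
    """Create n_new synthetic embeddings for minority_class by interpolating
    each sampled minority node with one of its k nearest minority neighbors."""
    rng = rng or np.random.default_rng(0)
    minority = emb[labels == minority_class]
    # Pairwise distances within the minority class (fine for small datasets).
    dists = np.linalg.norm(minority[:, None] - minority[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]
    new_points = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        j = neighbors[i, rng.integers(k)]
        t = rng.random()  # interpolation coefficient in (0, 1)
        new_points.append(minority[i] + t * (minority[j] - minority[i]))
    return np.stack(new_points)


# Toy arrays standing in for GNN node embeddings and an imbalanced label vector.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 16))
labels = np.concatenate([np.zeros(180, dtype=int), np.ones(20, dtype=int)])

# Balance the classes, then fit XGBoost on the augmented embedding set.
new_emb = synthesize_minority(emb, labels, minority_class=1, n_new=160, rng=rng)
X = np.vstack([emb, new_emb])
y = np.concatenate([labels, np.ones(len(new_emb), dtype=int)])
clf = XGBClassifier(n_estimators=100, max_depth=4, eval_metric="logloss")
clf.fit(X, y)
print("Predicted class counts on original nodes:", np.bincount(clf.predict(emb)))
```

In an actual GNN pipeline, `emb` would come from the encoder's hidden layer rather than random draws, and the interpolation would typically be applied only to training-set nodes before the downstream classifier is fit.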
