Graph neural networks (GNNs) have recently achieved great success on graph classification tasks through supervised end-to-end training. In real-world settings, however, manual graph annotation is a complicated process that can introduce many noisy labels, which may significantly degrade GNN performance. We therefore investigate graph classification with label noise, a problem made challenging by the difficulty of graph representation learning and by the tendency of models to memorize noisy samples. In this work, we present a novel approach called Subgraph Set Network with Sample Selection and Consistency Learning (SPORT). To alleviate the overfitting of GNNs, SPORT characterizes each graph as a set of subgraphs generated by predefined strategies, which can be viewed as samples from the graph's underlying semantic distribution in graph space. We then develop an equivariant network that encodes the subgraph set while respecting its symmetry group. To further reduce the influence of noisy examples, we use the subgraph predictions to estimate the likelihood that a sample is clean or noisy, and then update labels accordingly. In addition, we propose a joint loss that introduces consistency regularization to improve model generalizability. Comprehensive experiments on a wide range of graph classification datasets demonstrate the effectiveness of SPORT. Specifically, SPORT outperforms the most competitive baseline by up to 6.4%.
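As a rough illustration of the sample-selection idea, the sketch below averages per-subgraph class probabilities for one graph and uses that average to decide whether the given label looks clean and, if not, whether to replace it. This is not the selection rule used by SPORT, which the abstract does not specify: the function name `select_and_relabel`, the averaging step, and both thresholds are assumptions made for illustration only.

```python
from typing import List, Tuple


def select_and_relabel(
    subgraph_probs: List[List[float]],  # one class-probability vector per subgraph
    given_label: int,
    clean_threshold: float = 0.5,    # hypothetical cut-off, not from the paper
    relabel_threshold: float = 0.9,  # hypothetical cut-off, not from the paper
) -> Tuple[bool, int]:
    """Return (is_clean, label_to_train_on) for a single graph."""
    num_classes = len(subgraph_probs[0])
    # Average the class probabilities over all subgraphs of the graph.
    mean_probs = [
        sum(p[c] for p in subgraph_probs) / len(subgraph_probs)
        for c in range(num_classes)
    ]
    # Assumption: a sample is treated as clean when the subgraphs,
    # on average, support its given label.
    if mean_probs[given_label] >= clean_threshold:
        return True, given_label
    # Otherwise, replace the label only if the subgraphs agree strongly
    # on another class; if not, keep the given label but flag it as noisy.
    best = max(range(num_classes), key=lambda c: mean_probs[c])
    if mean_probs[best] >= relabel_threshold:
        return False, best
    return False, given_label
```

For example, `select_and_relabel([[0.95, 0.05], [0.97, 0.03]], given_label=1)` flags the sample as noisy and proposes class 0, because both subgraphs strongly disagree with the given label. A flagged sample with weaker agreement keeps its label, so a downstream loss could down-weight it instead.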