How do loss functions impact the performance of graph neural networks?

Gabriel Jonas Duarte, Tamara Arruda Pereira, Amauri Holanda Souza Junior, Diego Mesquita, Erik Jhones Nascimento

doi:10.21528/cbic2021-161

Abstract

Graph neural networks (GNNs) have become the de facto approach for supervised learning on graph data. To train these networks, most practitioners employ the categorical cross-entropy (CE) loss. We can attribute this largely to the probabilistic interpretability of models trained using CE, since it corresponds to the negative log of the categorical/softmax likelihood. Nonetheless, loss functions are a modeling choice, and other training criteria can be employed, e.g., hinge loss and mean absolute error (MAE). Indeed, recent works have shown that deep learning models can benefit from adopting other loss functions. For instance, neural networks trained with symmetric losses (e.g., MAE) are robust to label noise. Perhaps surprisingly, the effect of using different losses on GNNs has not been explored. In this preliminary work, we gauge the impact of different loss functions on the performance of GNNs for node classification under i) noisy labels and ii) different sample sizes. In contrast to findings on Euclidean domains, our results for GNNs show that there is no significant difference between models trained with CE and other classical loss functions in either of these scenarios.
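
To make the losses named in the abstract concrete, the following is a minimal pure-Python sketch for a single node with class logits. The function names, the MAE-over-softmax formulation, and the Crammer-Singer style multiclass hinge are illustrative assumptions; the abstract does not specify which variants the authors used.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # CE: negative log of the softmax probability of the true class.
    return -math.log(softmax(logits)[label])

def mean_absolute_error(logits, label):
    # Assumed variant: MAE between softmax output and one-hot target.
    probs = softmax(logits)
    errors = [abs(p - (1.0 if k == label else 0.0))
              for k, p in enumerate(probs)]
    return sum(errors) / len(probs)

def multiclass_hinge(logits, label, margin=1.0):
    # Assumed variant: penalize only the largest margin violation
    # among the incorrect classes.
    worst = max(z for k, z in enumerate(logits) if k != label)
    return max(0.0, margin + worst - logits[label])
```

Under this formulation, the MAE summed over all possible labels is the same constant, 2(K-1)/K for K classes, whatever the logits are. This is the sense in which the loss is symmetric, whereas CE grows without bound as the probability of the true class approaches zero.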

Full Text