Recent advances in node and graph classification can be attributed to contrastive learning and similarity search. Despite considerable progress, these approaches have drawbacks: integrating similarity search adds complexity to the model, while applying contrastive learning to non-transferable domains or out-of-domain datasets yields less competitive results. In this work, we propose maintaining domain specificity for these tasks, which shows potential to improve performance by removing the need for an additional similarity search. We use a fraction of the domain-specific datasets for pre-training, generating augmented pairs that retain structural similarity to the original graph and thereby increase the number of views. This strategy involves a comprehensive exploration of augmentations to find those best suited to building multi-view embeddings. An evaluation protocol focused on error minimization, accuracy enhancement, and overfitting prevention guides this process toward learning inherent, transferable structural representations that span diverse datasets. We combine the pre-trained embeddings with the source graph as input, leveraging local and global graph information to enrich downstream tasks. Furthermore, to maximize the utility of negative samples in contrastive learning, we extend the training mechanism during the pre-training stage. In comprehensive experiments on benchmark graph datasets of varying sizes and characteristics, our method consistently outperforms the baseline approaches, establishing new state-of-the-art results.
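To make the idea of augmented views and negative samples concrete, the following is a minimal sketch and not the method proposed in this work. The abstract does not name the augmentations or the loss, so random edge dropping and an InfoNCE-style objective are assumptions chosen for illustration. The names `drop_edges`, `cosine`, and `info_nce` are hypothetical, and the toy embeddings stand in for the output of an encoder.

```python
import math
import random


def drop_edges(edges, drop_prob, rng):
    """Assumed augmentation: remove each edge independently with probability drop_prob."""
    return [edge for edge in edges if rng.random() >= drop_prob]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss over a batch of embeddings.

    view_a[i] and view_b[i] form a positive pair; every view_b[j] with
    j != i acts as a negative sample for view_a[i].
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Two structurally similar views of one toy graph (a 4-cycle).
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
view_1 = drop_edges(edges, 0.2, rng)
view_2 = drop_edges(edges, 0.2, rng)

# Toy embeddings standing in for encoder outputs of the two views.
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce(aligned, aligned)
loss_mismatched = info_nce(aligned, swapped)
```

In this sketch the loss is lower when each embedding is closest to its own positive pair (`loss_matched`) than when it is closest to a negative (`loss_mismatched`), which is the signal that contrastive pre-training optimizes.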