Evaluating Different Node Feature Extraction Methods for Graph Coloring Problem with Graph Neural Network
The graph coloring problem (GCP) is a classical NP-hard problem that aims to assign different colors to adjacent nodes while minimizing the total number of colors used. While previous studies have used graph neural networks (GNNs) to solve GCP, they rely only on randomly initialized node features or a trainable embedding layer, leaving alternative node feature extraction methods unexplored. Therefore, this study explores eight node feature extraction methods, including positional and structural node features. We assess their impact on GNN performance for GCP and provide insights into why certain methods outperform others. Across 12 COLOR graphs and 3 large citation graphs, experimental results show that the trainable embedding layer and node2vec achieve the strongest performance. Under different hyperparameter settings, the embedding layer is consistently effective at minimizing conflicts across GNN architectures, while node2vec shows better average performance and stability on large graphs. Compared to the embedding layer baseline, node2vec reduces mean conflicts by 43.63% and standard deviation by 27.23% on large citation graphs. However, these performance gains involve a computational trade-off: the embedding layer requires backpropagation, and node2vec requires pre-computation for its biased random walks. Furthermore, positional node features give better prediction performance than structural ones, with approximately 9.3× lower mean conflicts in the best case.
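The abstract above reports results in terms of conflicts, that is, edges whose two endpoints receive the same color. As a minimal sketch of that metric (the function names and the toy graph are illustrative assumptions, not taken from the paper), conflicts and a relative reduction such as the quoted percentages could be computed like this:

```python
from typing import Dict, Iterable, Tuple


def count_conflicts(edges: Iterable[Tuple[int, int]], colors: Dict[int, int]) -> int:
    """Count edges whose two endpoints share a color (hypothetical helper)."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
proper = {0: 0, 1: 1, 2: 2, 3: 0}    # valid 3-coloring
improper = {0: 0, 1: 0, 2: 1, 3: 1}  # only two colors, so some edges clash

if __name__ == "__main__":
    print(count_conflicts(edges, proper))
    print(count_conflicts(edges, improper))
```

A GNN-based solver would typically produce `colors` by taking the arg-max of per-node class scores; that step is outside this sketch.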
- Conference Article
- 10.1109/ickg55886.2022.00009
- Nov 1, 2022
Graph Neural Networks (GNNs) require that all nodes have initial representations, which are usually derived from the node features. When the node features are absent, GNNs can learn node embeddings with an embedding layer or use pre-trained network embeddings for the initial node representations. However, these approaches are limited because i) they cannot be easily extended to initialize new nodes that are added to the graph for inference after training and ii) they are memory intensive and store a fixed representation for every node in the graph. In this work, we present PropInit, a scalable node representation initialization method for training GNNs and other Graph Machine Learning (ML) models on heterogeneous graphs where some or all node types have no natural features. Unlike existing methods that learn a fixed embedding vector for each node, PropInit learns an inductive function that leverages the metagraph to initialize node representations. As a result, PropInit is fully inductive and can be applied, without retraining, to new nodes without features that are added to the graph. PropInit also scales to large graphs, as it requires only a small fraction of the memory of existing methods. On public benchmark heterogeneous graph datasets, using various GNN models, PropInit achieves performance comparable to or better than competing approaches while needing only 0.01% to 2% of their memory consumption for representing node embeddings. We also demonstrate PropInit's effectiveness on an industry heterogeneous graph dataset for fraud detection and achieve better classification accuracy than learning full embeddings while reducing the embedding memory footprint during training and inference by 99.99%.
- Conference Article
- 10.1109/bcd54882.2022.9900611
- Aug 4, 2022
We investigate works in the propagation-based fake news detection domain, which has recently sought to improve performance through the use of Graph Neural Networks (GNNs). Generally, existing works argue that using GNNs can give results superior to those obtained using classic graph-based methods. We agree with this argument, given that GNNs are capable of gaining superior performance by leveraging node features. But we argue that existing works have not identified the fact that the expressivity of GNNs is limited and bounded by node features. Existing works do not acknowledge that, by utilizing GNNs, they implicitly assume node features are strongly correlated with node labels. There is evidence that the node features that have been employed do not necessarily correlate with node labels. Instead of having a profound theoretical motivation, they have empirically observed that focusing on node features with strong feature-label correlation can increase predictive capability. This is a sub-optimal way to view the problem; in fact, we argue that finding node features based on correlation is not practical or effective. Our first contribution is shifting readers from a node-level view, i.e., correlating node features with labels, to a graph-level view. In the graph-level view, we exploit the relationship between graph isomorphism and GNNs' expressivity, which can be used to better understand and interpret the relation between node features and GNNs' expressivity. We conduct a wide range of experiments on the basis of both the node-level view and the graph-level view and find that the graph-level view is more interpretable and strongly matches the results. Further, we gain insights on node features that would not be obtainable from a node-level view. In order to have a fair and comprehensive analysis of node features, we built a unified dataset that includes a wide range of node features. 
Our results indicate that, as we improve model accuracy on the basis of the graph-level view, the models' generalizability decreases. We provide our hypothesis for this performance trade-off on the basis of the graph-level view. Our results and insights call for a much broader discussion on whether any sort of filtering method is effective. So, we conclude our work by providing readers with possible solutions that can help find harmony between node features and GNNs' expressivity.
- Conference Article
8
- 10.1109/icde53745.2022.00013
- May 1, 2022
Graph Neural Networks (GNNs) have been successfully applied to a variety of graph analysis tasks. Some recent studies have demonstrated that decoupling neighbor aggregation and feature transformation helps to scale GNNs to large graphs. However, very large graphs, with billions of nodes and millions of features, are still beyond the capacity of most existing GNNs. In addition, when we are only interested in a small number of nodes (called target nodes) in a large graph, it is inefficient to use the existing GNNs to infer the labels of these few target nodes. The reason is that they need to propagate and aggregate either node features or predicted labels over the whole graph, which incurs high additional costs relative to the few target nodes. To solve the above challenges, in this paper we propose a novel scalable and effective GNN framework COSAL. In COSAL, we substitute the expensive aggregation with an efficient proximate node selection mechanism, which picks out the most important $K$ nodes for each target node according to the graph topology. We further propose a fine-grained neighbor importance quantification strategy to enhance the expressive power of COSAL. Empirical results demonstrate that our COSAL achieves superior performance in accuracy, training speed, and partial inference efficiency. Remarkably, in terms of node classification accuracy, our model COSAL outperforms baselines by significant margins of 2.22%, 2.23%, and 3.95% on large graph datasets Amazon2M, MAG-Scholar-C, and ogbn-papers100M, respectively. Code available at https://github.com/joyce-x/COSAL.
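The abstract above does not say how the $K$ most important nodes are scored, only that the selection depends on graph topology. The sketch below substitutes personalized PageRank as one plausible topology-based score. This is an assumption made for illustration, not COSAL's actual mechanism, and all names are hypothetical.

```python
from typing import Dict, List


def personalized_pagerank(adj: Dict[int, List[int]], source: int,
                          alpha: float = 0.15, iters: int = 50) -> Dict[int, float]:
    """Power iteration for PageRank with restarts to `source` (assumed scoring rule)."""
    p = {n: 0.0 for n in adj}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for u, neighbors in adj.items():
            if not neighbors:
                # Dangling node: send its walking mass back to the source.
                nxt[source] += (1.0 - alpha) * p[u]
                continue
            share = (1.0 - alpha) * p[u] / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        nxt[source] += alpha  # restart mass
        p = nxt
    return p


def top_k_neighbors(adj: Dict[int, List[int]], target: int, k: int) -> List[int]:
    """Return the k highest-scoring nodes for `target`, excluding the target itself."""
    scores = personalized_pagerank(adj, target)
    ranked = sorted((n for n in adj if n != target), key=lambda n: (-scores[n], n))
    return ranked[:k]


# Path graph 0-1-2-3-4, stored as an adjacency list.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

if __name__ == "__main__":
    print(top_k_neighbors(path, target=0, k=2))
```

The point of such a selection step is that inference for a handful of target nodes only touches their selected nodes rather than propagating over the whole graph.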
- Research Article
5
- 10.1145/3709738
- Feb 10, 2025
- Proceedings of the ACM on Management of Data
Graph neural networks (GNNs) are models specialized for graph data and widely used in applications. To train GNNs on large graphs that exceed CPU memory, several systems have been designed to store data on disk and conduct out-of-core processing. However, these systems suffer from either read amplification when conducting random reads for node features that are smaller than a disk page, or degraded model accuracy by treating the graph as disconnected partitions. To close this gap, we build DiskGNN for high I/O efficiency and fast training without model accuracy degradation. The key technique is offline sampling, which decouples graph sampling from model computation. In particular, by conducting graph sampling beforehand for multiple mini-batches, DiskGNN acquires the node features that will be accessed during model computation and conducts pre-processing to pack the node features of each mini-batch contiguously on disk to avoid read amplification for computation. Given the feature access information acquired by offline sampling, DiskGNN also adopts designs including a four-level feature store to fully utilize the memory hierarchy of GPU and CPU to cache hot node features and reduce disk access, batched packing to accelerate feature packing during pre-processing, and pipelined training to overlap disk access with other operations. We compare DiskGNN with state-of-the-art out-of-core GNN training systems. The results show that DiskGNN has more than 8x speedup over existing systems while matching their best model accuracy. DiskGNN is open-source at https://github.com/Liu-rj/DiskGNN.
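To make the packing idea in the abstract above concrete, here is a minimal sketch in which the node IDs of each mini-batch are assumed to be known from offline sampling, and their feature rows are written back to back so that each batch can be fetched with one contiguous read. The in-memory `io.BytesIO` buffer stands in for a disk file, and the function names, feature width, and layout are illustrative assumptions rather than DiskGNN's actual format.

```python
import io
import struct
from typing import BinaryIO, Dict, List, Tuple

FEATURE_DIM = 4              # assumed feature width
ROW_BYTES = FEATURE_DIM * 4  # float32 per value


def pack_batches(features: Dict[int, List[float]], batches: List[List[int]],
                 out: BinaryIO) -> List[Tuple[int, int]]:
    """Write each batch's rows contiguously; return (byte offset, row count) per batch."""
    index = []
    for node_ids in batches:
        offset = out.tell()
        for nid in node_ids:
            out.write(struct.pack(f"<{FEATURE_DIM}f", *features[nid]))
        index.append((offset, len(node_ids)))
    return index


def read_batch(src: BinaryIO, entry: Tuple[int, int]) -> List[List[float]]:
    """Fetch one batch with a single seek and a single sequential read."""
    offset, n_rows = entry
    src.seek(offset)
    flat = struct.unpack(f"<{n_rows * FEATURE_DIM}f", src.read(n_rows * ROW_BYTES))
    return [list(flat[i * FEATURE_DIM:(i + 1) * FEATURE_DIM]) for i in range(n_rows)]


# Toy data: node i has the feature row [i, i, i, i].
features = {nid: [float(nid)] * FEATURE_DIM for nid in range(6)}
batches = [[5, 0, 3], [1, 5]]  # node IDs per mini-batch, as offline sampling might yield

buf = io.BytesIO()
index = pack_batches(features, batches, buf)

if __name__ == "__main__":
    print(index)
    print(read_batch(buf, index[1]))
```

In this simplified layout, node 5 is stored twice because it appears in both batches, trading extra disk space for sequential reads.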
- Dissertation
- 10.32657/10356/182340
- Jan 1, 2024
Graph representation learning distills the complex structures of graphs into tractable, low-dimensional vector spaces, capturing essential topological and attribute-based properties. Graph Neural Networks (GNNs) have become a pivotal tool in this domain, leveraging graph structures to iteratively update node representations through neighbor aggregations. These representations support fundamental tasks such as node classification, link prediction, and graph classification, applicable across diverse fields from social networks and biological systems to citation networks. Despite their success, GNNs face critical challenges: they often underperform on heterophilic graph data where connected nodes display dissimilar characteristics, suffer from oversmoothing which impairs performance as network depth increases, and are sensitive to hierarchical structures. Furthermore, they are vulnerable to adversarial attacks that can severely compromise model integrity. This thesis introduces the use of neural differential equations in GNNs to enhance representation learning and robustness, addressing these challenges comprehensively. The adoption of Graph Neural Differential Equation Networks (GDENs) employs a dynamic systems approach to evolve node features over continuous time, thereby enhancing the capacity of GNNs to process and learn from graph-structured data. This method governs node feature propagation through differential equations, enabling more refined control over the learning process compared to conventional methods. The initial contribution of this thesis enhances representation learning on heterophilic graphs through a neural convection-diffusion differential equation. Subsequently, the thesis explores the relationship between stability in dynamical systems and robustness within GDENs. A neural Hamiltonian differential equation model is developed, establishing energy-conservative systems within GNNs to bolster robustness against adversarial attacks. 
Extending beyond traditional integer-order differential equations, the thesis incorporates fractional calculus through the Fractional-Order Graph Neural Differential Equation Networks (F-GDENs) framework. This approach introduces memory and non-local interactions, boosting the networks' ability to handle hierarchical structures and mitigate oversmoothing. F-GDENs not only integrate seamlessly with existing GDENs to enhance representation learning across various datasets, but also demonstrate tighter output perturbation bounds in scenarios involving input and topology perturbations. Empirical results further validate the superior robustness of F-GDENs models compared to integer-order GDENs. In summary, this thesis advances the robustness and capacity of representation learning through GDENs by innovating with new differential equations and extending to fractional-order derivatives. These advancements establish a solid foundation for future research into robust and adaptive GNN architectures, presenting promising implications for practical applications.
- Research Article
44
- 10.1609/aaai.v37i4.25553
- Jun 26, 2023
- Proceedings of the AAAI Conference on Artificial Intelligence
Graph Neural Networks (GNNs) have been a prevailing technique for tackling various analysis tasks on graph data. The remarkable performance of GNNs rests on a key premise: complete and trustworthy initial graph descriptions (i.e., node features and graph structure). This premise is often not satisfied, since real-world graphs are often incomplete due to various unavoidable factors. In particular, GNNs face greater challenges when both node features and graph structure are incomplete at the same time. The existing methods focus on either feature completion or structure completion. They usually rely on the matching relationship between features and structure, or employ joint learning of node representation and feature (or structure) completion in the hope of achieving mutual benefit. However, recent studies confirm that the mutual interference between features and structure leads to the degradation of GNN performance. When both features and structure are incomplete, the mismatch between features and structure caused by the missing randomness exacerbates the interference between the two, which may trigger incorrect completions that negatively affect node representation. To this end, in this paper we propose a general GNN framework based on teacher-student distillation to improve the performance of GNNs on incomplete graphs, namely T2-GNN. To avoid the interference between features and structure, we separately design feature-level and structure-level teacher models to provide targeted guidance for the student model (base GNNs, such as GCN) through distillation. Then we design two personalized methods to obtain well-trained feature and structure teachers. To ensure that the knowledge of the teacher model is comprehensively and effectively distilled to the student model, we further propose a dual distillation mode to enable the student to acquire as much expert knowledge as possible. 
Extensive experiments on eight benchmark datasets demonstrate the effectiveness and robustness of the new framework on graphs with incomplete features and structure.
- Conference Article
2
- 10.1109/iccc51575.2020.9345090
- Dec 11, 2020
Graph neural networks have received a lot of attention in recent years, since many real-world data can naturally be represented by graph structures. Graph neural networks such as GCNs and GATs mainly focus on node features while ignoring edge features in graphs. However, in many kinds of graph-structured data, such as knowledge graphs and social networks, edge features are also important, as they contain vital information about relations between nodes that is commonly ignored or simplified into binary or scalar values by existing methods. In this work we build a novel learning method on graphs called GNN-EE, i.e. GNN with Edge Enhanced, which takes both the node features and edge features into account while updating representations of graph components and can be applied to most of the common graph neural networks such as GCNs and GATs. Our GNN-EE method fits in the message-passing framework and thus is easy to generalize. In addition, we extend the random-walk-based algorithms on graphs so that they can consider both node and edge features on graphs. We use those random-walk-based algorithms as a pre-training method on graphs with few initial features. We demonstrate the effectiveness and flexibility of our GNN-EE method through entity classification tasks and graph classification tasks.
- Research Article
37
- 10.1016/j.cose.2023.103285
- May 2, 2023
- Computers & Security
NE-GConv: A lightweight node edge graph convolutional network for intrusion detection
- Research Article
23
- 10.1016/j.eswa.2021.114655
- Feb 4, 2021
- Expert Systems with Applications
Node classification using kernel propagation in graph neural networks
- Research Article
10
- 10.1016/j.knosys.2022.108616
- Mar 25, 2022
- Knowledge-Based Systems
Graph Enhanced Neural Interaction Model for recommendation
- Conference Article
2
- 10.1109/bigdata50022.2020.9378189
- Dec 10, 2020
In recent years, graph neural networks have been widely used. Attention mechanisms have been introduced into graph neural networks to make them more applicable. Both GAT and AGNN show that the attention mechanism plays an important role in graph neural networks. Attention algorithms such as GAT and AGNN directly take the dot product with a learnable variable after concatenating (or computing the similarity of) node and neighbor features, without further processing the result, and then aggregate the neighbor information. A cosine similarity distance pruning algorithm based on the graph attention mechanism (CDP-GA) is proposed to optimize the attention matrix of nodes and their adjacent nodes. By calculating the cosine similarity between node features and neighbor features (the features here are obtained by a linear transformation), the similarity of nodes is regarded as the distance between nodes (or the weight of edges). We assume that the degree of aggregation of node information is inversely proportional to the distance between nodes (similar to the heat conduction formula). In the method, we prune the neighborhood of each node according to the cosine similarity to obtain the final attention coefficient matrix. In this way, the attention mechanism in the graph neural network is further refined, and the loss incurred when aggregating neighbor information is reduced. In experiments on three datasets, our model is compared with the classification results of GAT and AGNN and with related graph neural network algorithms. The results show that the algorithm performs better on the three known datasets.
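The pruning step described in the abstract above can be sketched as follows. The fixed similarity threshold and the softmax normalization over the surviving neighbors are assumptions made for illustration, since the abstract does not specify the pruning rule; the learned linear transformation of the features is also omitted, and all names are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pruned_attention(node_feat: List[float], neighbor_feats: Dict[str, List[float]],
                     threshold: float = 0.5) -> Dict[str, float]:
    """Drop neighbors below `threshold`, then softmax-normalize the rest (assumed rule)."""
    sims = {n: cosine(node_feat, f) for n, f in neighbor_feats.items()}
    kept = {n: s for n, s in sims.items() if s >= threshold}
    if not kept:
        return {}
    z = sum(math.exp(s) for s in kept.values())
    return {n: math.exp(s) / z for n, s in kept.items()}


# Toy neighborhood: "a" is identical to the node, "b" is orthogonal, "c" is in between.
weights = pruned_attention([1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]})

if __name__ == "__main__":
    print(weights)
```

Here the orthogonal neighbor "b" is pruned, and the remaining attention mass is split between "a" and "c", with the more similar neighbor receiving the larger weight.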
- Research Article
1
- 10.1109/tnnls.2025.3577702
- Oct 1, 2025
- IEEE Transactions on Neural Networks and Learning Systems
Training graph neural networks (GNNs) on large graphs is challenging due to both the high memory and computational costs of end-to-end training and the scarcity of detailed node-level annotations. To address these challenges, we propose layer-wise regularized graph infomax (LRGI), a self-supervised learning algorithm inspired by predictive coding, a biologically motivated principle in which each layer is trained locally to predict its future inputs. LRGI trains GNNs layer by layer, decoupling their memory and time complexity from the network depth, thereby enabling scalable training on large graphs. In LRGI, each layer learns to predict the features propagated from its neighbors, allowing independent training of each layer. This approach, combined with regularization that promotes diverse representations, also helps mitigate oversmoothing in deep GNNs. Experiments on large inductive graph benchmarks demonstrate that LRGI achieves competitive performance compared to state-of-the-art end-to-end methods, while substantially improving efficiency.
- Research Article
- 10.1007/s10994-025-06815-z
- Jun 24, 2025
- Machine Learning
Graph neural networks (GNNs) are among the most widely used methods for node classification in graphs. A common strategy to improve their predictive performance is to enrich nodes with additional features. A weakness of this method is that the set of appropriate features can vary from graph to graph. We address this shortcoming by proposing a novel method. In a preprocessing step, a first GNN is trained on a set of graphs with varying structural properties, using a candidate set of node features fixed in advance. The resulting GNN model is then used to predict the most relevant features from the candidate set for unseen target graphs, which are later processed for node classification. For each target graph, a second GNN is trained on the graph, which is enriched with the node feature vectors calculated for the features selected by the first GNN. A key advantage of the proposed method is that the features are selected without computing the candidate features for the target graph. Our experimental results on synthetic and real-world graphs show that even a few features selected in this way are sufficient to significantly improve predictive performance over GNNs that use either none or all of the candidate features. Moreover, the time needed to learn the second GNN for the target graph can be reduced by up to two orders of magnitude.
- Conference Article
8
- 10.1109/icassp43922.2022.9747865
- May 23, 2022
Graph Neural Networks (GNNs) show impressive performance in many practical scenarios, which can be largely attributed to their stability properties. Empirically, GNNs can scale well on large size graphs, but this is contradicted by the fact that existing stability bounds grow with the number of nodes. Graphs with well-defined limits can be seen as samples from manifolds. Hence, in this paper, we analyze the stability properties of convolutional neural networks on manifolds to understand the stability of GNNs on large graphs. Specifically, we focus on stability to relative perturbations of the Laplace-Beltrami operator. To start, we construct frequency ratio threshold filters which separate the infinite-dimensional spectrum of the Laplace-Beltrami operator. We then prove that manifold neural networks composed of these filters are stable to relative operator perturbations. As a product of this analysis, we observe that manifold neural networks exhibit a trade-off between stability and discriminability. Finally, we illustrate our results empirically in a wireless resource allocation scenario where the transmitter-receiver pairs are assumed to be sampled from a manifold.
- Conference Article
7
- 10.1109/icassp49357.2023.10094894
- Jun 4, 2023
Graph Neural Networks (GNNs) rely on graph convolutions to exploit meaningful patterns in networked data. Based on matrix multiplications, convolutions incur high computational costs, leading to scalability limitations in practice. To overcome these limitations, proposed methods rely on training GNNs on a smaller number of nodes, and then transferring the GNN to larger graphs. Even though these methods are able to bound the difference between the outputs of the GNN with different numbers of nodes, they do not provide guarantees against the optimal GNN on the very large graph. In this paper, we propose to learn GNNs on very large graphs by leveraging the limit object of a sequence of growing graphs, the graphon. We propose to grow the size of the graph as we train, and we show that our proposed methodology, learning by transference, converges to a neighborhood of a first order stationary point on the graphon data. A numerical experiment validates our proposed approach.
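The idea of growing the graph during training can be illustrated by sampling progressively larger graphs from a graphon, using the standard construction in which each node draws a uniform latent position and each pair is connected with probability given by the graphon. The particular graphon, the linear growth schedule, and all names below are assumptions for illustration, and the GNN update that would run on each sampled graph is left out.

```python
import random
from typing import Callable, List


def sample_graph(graphon: Callable[[float, float], float], n: int,
                 rng: random.Random) -> List[List[int]]:
    """Sample an n-node undirected graph: edge (i, j) appears with probability W(u_i, u_j)."""
    latents = [rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < graphon(latents[i], latents[j]):
                adj[i][j] = adj[j][i] = 1
    return adj


def growing_sizes(start: int, step: int, epochs: int) -> List[int]:
    """Hypothetical schedule: add `step` nodes per epoch."""
    return [start + step * e for e in range(epochs)]


def example_graphon(x: float, y: float) -> float:
    """Hypothetical graphon: nearby latent positions connect more often."""
    return 0.8 * (1.0 - abs(x - y))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in growing_sizes(start=4, step=2, epochs=3):
        adj = sample_graph(example_graphon, n, rng)
        # A training step on `adj` would go here; it is omitted from this sketch.
        print(n, sum(map(sum, adj)) // 2)
```

Early epochs therefore run on small, cheap graphs, and the cost per step rises only as the sampled graph grows.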