Graph Datasets Research Articles

Graph Neural Networks (GNNs) have gained prominence in domains such as social network analysis, recommendation systems, and drug discovery because they can model complex relationships in graph-structured data. GNNs can behave incorrectly, with potentially severe consequences, so testing them is essential. However, labeling all test inputs for GNNs can be prohibitively costly and time-consuming, especially for large and complex graphs. Test selection has emerged as a way to reduce this labeling cost: its objective is to select a subset of tests from the complete test set. Various test selection techniques have been proposed for traditional deep neural networks (DNNs), but adapting them to GNNs is challenging because DNN and GNN test data differ. Specifically, DNN test inputs are independent of each other, whereas GNN test inputs (nodes) exhibit intricate interdependencies. It therefore remains unclear whether DNN test selection approaches perform effectively on GNNs. To fill this gap, we conduct an empirical study that systematically evaluates test selection methods in the context of GNNs, focusing on three aspects: 1) Misclassification detection: selecting test inputs that are more likely to be misclassified; 2) Accuracy estimation: selecting a small set of tests to precisely estimate the accuracy of the whole test set; 3) Performance enhancement: selecting retraining inputs to improve GNN accuracy. The study covers 7 graph datasets, including both node classification and graph classification datasets, and 8 GNN models, and evaluates 22 test selection approaches.
Our findings reveal that: 1) In GNN misclassification detection, confidence-based test selection methods, which perform well on DNNs, are not as effective; 2) In GNN accuracy estimation, clustering-based methods consistently outperform random selection, but only slightly; 3) In selecting inputs for GNN performance enhancement, test selection methods such as confidence-based and clustering-based methods are only slightly effective; 4) Also for performance enhancement, node importance-based test selection methods are not suitable, and in many cases they perform worse than random selection.
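
As a rough illustration of two of the aspects above (not the study's implementation), the sketch below ranks test inputs by least confidence, a common confidence-based criterion for misclassification detection, and estimates accuracy from a random sample, the baseline against which clustering-based methods are compared. The function names and toy inputs are hypothetical, and the model outputs are assumed to be per-class probability lists.

```python
import random

def least_confidence_selection(probs, budget):
    """Return indices of the `budget` inputs whose top-class probability is lowest.

    `probs` is a list of per-class probability lists, one per test input
    (for node classification, one per node). Hypothetical helper.
    """
    top_scores = [max(p) for p in probs]
    # Ascending sort: the least confident inputs come first.
    ranked = sorted(range(len(probs)), key=lambda i: top_scores[i])
    return ranked[:budget]

def random_accuracy_estimate(predictions, labels, budget, seed=0):
    """Estimate accuracy by labeling only a random sample of `budget` inputs."""
    rng = random.Random(seed)
    sample = rng.sample(range(len(predictions)), budget)
    correct = sum(predictions[i] == labels[i] for i in sample)
    return correct / budget

# Toy example: four test inputs, two classes.
probs = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4], [0.99, 0.01]]
selected = least_confidence_selection(probs, budget=2)
```

Note that this criterion scores each input independently, which is exactly the assumption the abstract flags as questionable for interdependent GNN test nodes.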

The acquisition, processing, mining, and visualization of sensory data for knowledge discovery and decision support has recently become a popular area of research. Its value stems largely from its role in the ongoing improvement of healthcare and related disciplines. As a result, a huge amount of data has been collected and analyzed. These data are made available to the research community in various shapes and formats, and representing and studying them as graphs or networks is itself an active research area. However, the large size of such graph datasets poses challenges for data mining and visualization. For example, knowledge discovery from the Bio-Mouse-Gene dataset, which has over 43 thousand nodes and 14.5 million edges, is a non-trivial job. Summarizing such large graphs is a useful alternative, since graph summarization aims to enable efficient analysis of complex, large-sized data. During summarization, nodes that have similar structural properties are merged together. In doing so, traditional methods often overlook the importance of personalizing the summary, which would help highlight certain targeted nodes. Personalized or context-specific scenarios require a more tailored approach to accurately capture distinct patterns and trends. Personalized graph summarization therefore aims to produce a concise depiction of the graph that emphasizes connections closer to a given set of target nodes. In this paper, we present a faster algorithm for the personalized graph summarization (PGS) problem, named IPGS, designed to facilitate enhanced and effective data mining and visualization of datasets from various domains, including biosensors.
Our objective is to obtain a compression ratio similar to that of the state-of-the-art PGS algorithm, but in less time. To achieve this, we reduce the execution time of the current state-of-the-art approach by using weighted locality-sensitive hashing, and we evaluate the result on eight large publicly available datasets. The experiments demonstrate the effectiveness and scalability of IPGS while providing a compression ratio similar to that of the state-of-the-art approach. In this way, our research contributes to the study and analysis of sensory datasets from the perspective of graph summarization. We also present a detailed study on the Bio-Mouse-Gene dataset, conducted to investigate the effectiveness of graph summarization in the domain of biosensors.
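
To make the merging step more concrete, the sketch below groups nodes whose neighbor sets hash to the same MinHash signature, one simple form of locality-sensitive hashing. It is a minimal sketch under stated assumptions, not IPGS: it uses plain unweighted MinHash rather than the weighted variant the abstract mentions, it ignores personalization toward target nodes, and the function names, parameters, and toy graph are hypothetical. Node identifiers are assumed to be non-negative integers.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # modulus for the hash family (a * x + b) mod PRIME

def minhash_signature(neighbors, hash_params):
    """MinHash signature of a neighbor set; isolated nodes get an empty signature."""
    if not neighbors:
        return ()
    return tuple(min((a * v + b) % PRIME for v in neighbors) for a, b in hash_params)

def candidate_supernodes(adjacency, num_hashes=4, seed=0):
    """Bucket nodes by signature; each bucket is a candidate group to merge.

    Nodes with identical neighbor sets always share a bucket. Nodes with
    similar neighbor sets share one with probability that grows with their
    Jaccard similarity. Hypothetical helper, unweighted MinHash only.
    """
    rng = random.Random(seed)
    hash_params = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
                   for _ in range(num_hashes)]
    buckets = defaultdict(list)
    for node, neighbors in adjacency.items():
        buckets[minhash_signature(neighbors, hash_params)].append(node)
    return sorted(sorted(group) for group in buckets.values())

# Toy graph: nodes 1 and 2 share neighbors {3, 4}; nodes 3 and 4 share {1, 2}.
adjacency = {1: {3, 4}, 2: {3, 4}, 3: {1, 2}, 4: {1, 2}, 5: {6}, 6: {5}}
groups = candidate_supernodes(adjacency)
```

Hashing avoids comparing every pair of nodes, which is why this family of techniques is attractive for graphs with millions of edges. A real summarizer would still verify each candidate group before merging it.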

Articles published on Graph Datasets

Design your own universe: a physics-informed agnostic method for enhancing graph neural networks

NLA-GNN: Non-local information aggregated graph neural network for heterogeneous graph embedding

Memristive Crossbar Array-Based Probabilistic Graph Modeling

Affinity Uncertainty-Based Hard Negative Mining in Graph Contrastive Learning

MDD-FedGNN: A vertical federated graph learning framework for malicious domain detection

Hybrid Quantum or Purely Classical? Assessing the Utility of Quantum Feature Embeddings

Graph classification using high-difference-frequency subgraph embedding

Data Pruning-enabled High Performance and Reliable Graph Neural Network Training on ReRAM-based Processing-in-Memory Accelerators

TransE-MTP: A New Representation Learning Method for Knowledge Graph Embedding with Multi-Translation Principles and TransE

Graph Relearn Network: Reducing performance variance and improving prediction accuracy of graph neural networks

Sparse graphs-based dynamic attention networks

Fair-RGNN: Mitigating Relational Bias on Knowledge Graphs

ADPSCAN: Structural Graph Clustering with Adaptive Density Peak Selection and Noise Re-Clustering

Towards Exploring the Limitations of Test Selection Techniques on Graph Neural Networks: An Empirical Study

Diverse joint nonnegative matrix tri-factorization for attributed graph clustering

Edge Deletion based Subgraph Hiding

Behavior Based Group Recommendation from Social Media Dataset by Using Deep Learning and Topic Modeling

Enhanced Data Mining and Visualization of Sensory-Graph-Modeled Datasets through Summarization

A semantic backdoor attack against graph convolutional networks

Attributed graph subspace clustering with residual compensation guided by adaptive dual manifold regularization
