Random Sampling Method of Large-Scale Graph Data Classification

Rashed Mustafa, Mohammad Sultan Mahmud, Mahir Shadid

doi:10.17576/jkukm-2024-36(2)-14

Abstract

Graph data appears in a broad range of real-world applications that model complex objects in big data. Effective analysis of graph data provides a deeper understanding of the data in data mining tasks, including classification, clustering, prediction, and recommendation. Mining a large number of graphs is challenging because memory limits prevent state-of-the-art methods from scaling. To address this issue, we propose a novel approximate random sampling method for large-scale graph data classification. In this approach, we apply a representation method that encodes each graph as a record (a vector string), so that a set of N graphs becomes a file of N records. We then partition the records into disjoint data blocks such that each data block is a random sample of the data file. Next, we randomly select a subset of these data blocks, each of which is a random sample of the graph dataset, and compute the distributions of different graph properties. Because the data blocks are much smaller than the entire dataset, each can be analyzed efficiently on a small standalone machine, and multiple data blocks can be analyzed in parallel on multiple nodes of a cluster. Finally, we classify the graphs in the data blocks using the SVM algorithm. In our experimental evaluation, the proposed method outperformed state-of-the-art graph kernels in accuracy on graph classification datasets.
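The encode, partition, and select steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the graph representation, so `encode_graph` and its three features (node count, edge count, maximum degree) are hypothetical stand-ins, and the function names, block counts, and seeds are assumptions. The SVM classification step is omitted.

```python
import random
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def encode_graph(num_nodes: int, edges: Sequence[Edge]) -> str:
    """Hypothetical encoding: turn one graph into one record (a vector string)."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Illustrative features only; the paper's actual representation is not given here.
    features = [num_nodes, len(edges), max(degree, default=0)]
    return ",".join(str(x) for x in features)


def partition_into_blocks(records: Sequence[str], num_blocks: int, seed: int = 0) -> List[List[str]]:
    """Shuffle the records, then deal them into disjoint blocks.

    After a uniform shuffle, every block is a random sample of the full file.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    return [shuffled[i::num_blocks] for i in range(num_blocks)]


def select_blocks(blocks: Sequence[List[str]], k: int, seed: int = 0) -> List[List[str]]:
    """Randomly pick k of the blocks (without replacement) for analysis."""
    rng = random.Random(seed)
    return rng.sample(list(blocks), k)


if __name__ == "__main__":
    # Ten small path graphs with 2..11 nodes, used purely as toy input.
    graphs = [(n, [(i, i + 1) for i in range(n - 1)]) for n in range(2, 12)]
    records = [encode_graph(n, e) for n, e in graphs]
    blocks = partition_into_blocks(records, num_blocks=3, seed=42)
    chosen = select_blocks(blocks, k=2, seed=42)
    print([len(b) for b in blocks], len(chosen))
```

Each selected block is small enough to be processed independently, which is what would allow the per-block analysis to run on a single machine or be spread across cluster nodes.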


More From: Jurnal Kejuruteraan

Similar Papers

Graph Kernels: State-of-the-Art and Future Challenges
Karsten Borgwardt ... Bastian Rieck
01 Jan 2020

Execution Feature Extraction and Prediction for Large-Scale Graph Processing Applications
Fangyuan Li
01 Sep 2019

Big Data of Complex Networks
19 Aug 2016

Very large-scale data classification based on K-means clustering and multi-kernel SVM
Tinglong Tang ... Meng Zhao
Soft Computing | VOL. 23
29 Jan 2018
