Kernel Spectral Clustering for Big Data Networks

Raghvendra Mall, Johan Suykens, Rocco Langone

doi:10.3390/e15051567

Abstract

This paper shows the feasibility of using the Kernel Spectral Clustering (KSC) method for community detection in big data networks. KSC employs a primal-dual framework to construct a model, which gives it the powerful property of effectively inferring community affiliation for out-of-sample extensions. Because the original large kernel matrix cannot fit into memory, we select a smaller subgraph that preserves the overall community structure to construct the model. The model then uses the out-of-sample extension property to infer the community membership of the unseen nodes. We provide a novel memory- and computationally efficient model selection procedure based on angular similarity in the eigenspace. We demonstrate the effectiveness of KSC on large-scale synthetic networks and real-world networks such as the YouTube network, a road network of California and the LiveJournal network. These networks contain millions of nodes and several million edges.
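As a rough illustration of what angular similarity in the eigenspace can mean, the sketch below assigns a projection vector to the cluster prototype with which it forms the smallest angle (largest cosine similarity). The function names, the prototype vectors and the toy 2-D data are assumptions made for illustration. This is not the paper's model selection criterion, only the cosine computation that such a criterion could build on.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumes neither is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def assign_by_angle(projection, prototypes):
    """Return the index of the prototype with the largest cosine similarity."""
    sims = [cosine_similarity(projection, p) for p in prototypes]
    return max(range(len(prototypes)), key=lambda i: sims[i])

# Hypothetical 2-D eigenspace with two cluster prototypes (illustrative values).
prototypes = [(1.0, 0.1), (0.1, 1.0)]
print(assign_by_angle((3.0, 0.5), prototypes))  # closest in angle to prototype 0
print(assign_by_angle((0.2, 0.9), prototypes))  # closest in angle to prototype 1
```

Because the comparison uses angles rather than distances, a point far from the origin and a point near it are treated alike as long as they lie along the same direction.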

Highlights

In the modern era, complex networks are ubiquitous.
We show that kernel spectral clustering is applicable for community detection in big data networks.
For the Infomap [7] and Louvain [9] community detection techniques, we evaluate the subset obtained by Fast and Unique Representative Subset (FURS) selection on metrics such as computation time, clustering coefficient (CCF), coverage and variation of information (VI).
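Of the metrics listed, variation of information is the one most easily stated as a formula: VI(A, B) = H(A) + H(B) - 2 I(A; B) for two partitions A and B of the same node set. The sketch below computes it in nats using only the standard library. The function name and the toy labelings are assumptions for illustration, and the source does not specify how the authors computed the metric.

```python
import math
from collections import Counter

def variation_of_information(labels_a, labels_b):
    """VI(A, B) = H(A|B) + H(B|A), in nats, for two labelings of the same nodes."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same nodes")
    n = len(labels_a)
    count_a = Counter(labels_a)                   # cluster sizes in A
    count_b = Counter(labels_b)                   # cluster sizes in B
    count_ab = Counter(zip(labels_a, labels_b))   # sizes of the overlaps
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # -p_ab * [log(p_ab / p_a) + log(p_ab / p_b)]
        vi -= p_ab * (math.log(n_ab / count_a[a]) + math.log(n_ab / count_b[b]))
    return vi

# Same partition under renamed labels: VI is zero.
print(variation_of_information([0, 0, 1, 1], [1, 1, 0, 0]))
# Two independent two-way splits: VI equals 2 * log(2).
print(variation_of_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

A lower value means the two partitions agree more closely, and a value of zero means they are identical up to relabeling.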

Summary

Introduction

The omnipresence of complex networks is reflected in domains like social networks, web graphs, road graphs, communication networks, biological networks and financial networks. A network is represented as a graph in which the vertices (V) are the nodes and the edges (E) depict the relationships between these nodes. These networks exhibit community-like structure, where nodes within a community are densely connected and the connections between communities are sparse. The major drawback of spectral clustering methods is the construction of the large N × N affinity matrix, where N is the number of nodes in the network, which requires calculating the similarity between every pair of nodes in the network. As the size of the network increases, the O(N^2) computation and storage of this N × N affinity matrix become infeasible.
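The quadratic storage cost can be made concrete with a little arithmetic. The sketch below estimates the memory needed for a dense N × N affinity matrix, assuming 8-byte floating-point entries. The function name, the entry size and the example node counts are illustrative assumptions, not figures from the paper.

```python
def dense_affinity_bytes(num_nodes, bytes_per_entry=8):
    """Bytes needed to store a dense num_nodes x num_nodes matrix."""
    return num_nodes * num_nodes * bytes_per_entry

# Hypothetical network sizes; 8 bytes per entry assumes double precision.
for n in (10_000, 1_000_000):
    gib = dense_affinity_bytes(n) / 2**30
    print(f"N = {n:>9,}: {gib:,.1f} GiB")
```

Under these assumptions a network with ten thousand nodes still fits in under a gibibyte, while one with a million nodes needs several terabytes, which is why building the model on a small representative subgraph matters.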



Journal: Entropy
Publication Date: May 3, 2013
Citations: 84
License type: CC BY 3.0


