Abstract

The problem of detecting communities in a graph is perhaps one of the most studied inference problems, given its simplicity and its widespread use across several disciplines. A very common benchmark for this problem is the stochastic block model, or planted partition problem, where a phase transition in the detection of the planted partition takes place as the signal-to-noise ratio is varied. Optimal detection algorithms based on spectral methods exist, but we show that they are extremely sensitive to slight modifications of the generative model. Recently Javanmard, Montanari and Ricci-Tersenghi [1] used statistical physics arguments and numerical simulations to show that finding communities in the stochastic block model via semidefinite programming is quasi-optimal. Further, the resulting semidefinite relaxation can be solved efficiently and is very robust with respect to changes in the generative model. In this paper we study in detail several practical aspects of this new algorithm, based on semidefinite programming, for the detection of the planted partition. The algorithm turns out to be very fast, allowing the solution of problems with $O(10^5)$ variables in a few seconds on a laptop computer.

Highlights

  • Introduction and model definition: When dealing with a high-dimensional dataset, one often looks for hidden structures that may be representative of the signal one is trying to extract from the noisy dataset.

  • A new spectral method based on the non-backtracking matrix introduced in Ref. [9] achieves optimality in the detection of the planted partition in the stochastic block model (SBM), at the cost of computing the complex spectrum of a non-symmetric matrix. Later, this spectral method was greatly simplified by showing its similarity to the computation of the spectrum of the so-called Bethe Hessian matrix [10], which is an $n \times n$ symmetric matrix defined as $H(r) = (r^2 - 1)\,\mathbb{1} - rA - D$, where $A$ is the adjacency matrix, $A_{ij} = A_{ji} = \mathbb{I}[(ij) \in E]$, and $D$ is a diagonal matrix with entries equal to the vertex degrees $d_i$. [...] is detected by computing the negative [...]

  • Although the partition detection based on the Bethe Hessian is optimal for the SBM, it turns out to be not very robust if the generative model departs even slightly from the random graph [...] doi:10.1088/1742-6596/699/1/012015

Summary

Introduction and model definition

When dealing with a high-dimensional dataset, one often looks for hidden structures that may be representative of the signal one is trying to extract from the noisy dataset. A new spectral method based on the non-backtracking matrix introduced in Ref. [9] achieves optimality in the detection of the planted partition in the stochastic block model (SBM), at the cost of computing the complex spectrum of a non-symmetric matrix. Later, this spectral method was greatly simplified by showing its similarity to the computation of the spectrum of the so-called Bethe Hessian matrix [10], which is an $n \times n$ symmetric matrix defined as $H(r) = (r^2 - 1)\,\mathbb{1} - rA - D$, where $A$ is the adjacency matrix, $A_{ij} = A_{ji} = \mathbb{I}[(ij) \in E]$, and $D$ is a diagonal matrix with entries equal to the vertex degrees $d_i$. [...] turns out to be given by the vector of signs of the components of the eigenvector corresponding to the second largest (in absolute value) eigenvalue.
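The sign rule in the last sentence can be illustrated with a minimal sketch. It assumes that the quantity being estimated is the two-group partition and that the matrix in question is the adjacency matrix $A$; the excerpt specifies neither. The function names (`planted_partition`, `spectral_partition`, `overlap`) and the parameter values are hypothetical choices for illustration. This is not the semidefinite programming algorithm studied in the paper.

```python
import numpy as np

def planted_partition(n, p_in, p_out, rng):
    """Sample a two-group planted partition graph (illustrative parameters).

    Returns the symmetric 0/1 adjacency matrix A and the planted labels (+1/-1).
    """
    labels = np.where(np.arange(n) < n // 2, 1, -1)
    same_group = labels[:, None] == labels[None, :]
    probs = np.where(same_group, p_in, p_out)
    # Draw each edge once in the strict upper triangle, then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    A = (upper | upper.T).astype(float)
    return A, labels

def spectral_partition(A):
    """Signs of the eigenvector of the second largest |eigenvalue| of A.

    Assumption: A is the plain adjacency matrix (not stated in the text).
    """
    vals, vecs = np.linalg.eigh(A)          # A is symmetric, so eigh applies
    order = np.argsort(-np.abs(vals))       # sort by absolute value, descending
    v2 = vecs[:, order[1]]
    return np.where(v2 >= 0, 1, -1)

def overlap(estimate, truth):
    """Agreement with the planted labels, up to a global sign flip."""
    return abs(np.mean(estimate * truth))

rng = np.random.default_rng(0)
A, truth = planted_partition(200, p_in=0.5, p_out=0.05, rng=rng)
estimate = spectral_partition(A)
print("overlap with planted partition:", overlap(estimate, truth))
```

The example uses a dense graph with a strong signal so that the second eigenvector is well separated from the bulk of the spectrum. In the sparse regime discussed in the text, this naive adjacency-based rule is not expected to perform as well.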

Spectral methods versus optimization methods
Our community detection algorithm and its performance
Findings
Conclusions and perspectives