Abstract

Relational learning generally relies on tools that model relational data. An undirected graph can represent these data, with vertices depicting entities and edges describing the relationships between them. These relationships can be well represented by multiple undirected graphs over the same set of vertices, with the edges of each graph capturing a different type of relation. The vertices of these networks are often structured in unknown clusters with varying connectivity properties. The multiple graphs can be arranged as a three-way tensor, where each slice is a count data matrix representing one graph. To extract relevant clusters, we propose a model-based co-clustering approach capable of dealing with multiple graphs. The proposed model can be seen as a tensor extension of mixture models of graphs, while the resulting co-clustering can be treated as a consensus clustering of nodes from multiple graphs. Applications on real datasets and comparisons with multi-view clustering and tensor decomposition methods illustrate the value of our contribution.
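As a rough illustration of the data layout described in the abstract, the sketch below stacks several undirected graphs over the same vertices into a three-way count tensor. The function name `graphs_to_tensor`, the edge-list input format, and the toy data are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def graphs_to_tensor(edge_lists, n_nodes):
    """Stack undirected graphs on the same vertices into an (n, n, v) count tensor.

    edge_lists: one list of (i, j) pairs per graph (hypothetical input format).
    Repeated pairs are counted, so each slice is a symmetric count matrix.
    """
    tensor = np.zeros((n_nodes, n_nodes, len(edge_lists)), dtype=np.int64)
    for k, edges in enumerate(edge_lists):
        for i, j in edges:
            tensor[i, j, k] += 1
            if i != j:
                # Undirected graph: mirror the edge to keep the slice symmetric.
                tensor[j, i, k] += 1
    return tensor

# Toy data: two relation types over the same four entities.
edge_lists = [
    [(0, 1), (1, 2), (0, 1)],  # graph 0: the pair (0, 1) is observed twice
    [(2, 3), (0, 3)],          # graph 1
]
X = graphs_to_tensor(edge_lists, n_nodes=4)
```

Each slice `X[:, :, k]` then plays the role of one graph, and a clustering of the first mode assigns every vertex to a single cluster shared across all slices.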

Highlights

  • Relational data are ubiquitous in various fields, and their accessibility has kept increasing in recent years

  • To apply Tensor SPLBM (TSPLBM) to image datasets, where graphs represent similarities between images according to each type of feature, we had to convert these similarity matrices into binary adjacency matrices (1 if the similarity is higher than the 97% quantile and 0 otherwise)

  • In this part, we explore how implicit consensus clustering through TSPLBM behaves compared to explicit consensus through cluster ensembles of multiple graphs
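The quantile thresholding mentioned in the highlights can be sketched as follows. This is a minimal illustration, not the authors' pre-processing code: the function name `binarize_similarity` is hypothetical, and excluding the diagonal (self-similarities) when computing the quantile is an assumption the source does not state.

```python
import numpy as np

def binarize_similarity(sim, q=0.97):
    """Turn a similarity matrix into a binary adjacency matrix.

    An entry becomes 1 if its similarity is strictly above the q-quantile
    of the off-diagonal similarities, and 0 otherwise.
    Assumption: self-similarities are ignored and the diagonal is set to 0.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    threshold = np.quantile(sim[off_diag], q)
    adj = (sim > threshold) & off_diag
    return adj.astype(np.int64)

# Toy symmetric similarity matrix standing in for one feature type.
rng = np.random.default_rng(0)
a = rng.random((50, 50))
sim = (a + a.T) / 2
adj = binarize_similarity(sim)
```

Applying the same function to the similarity matrix of each feature type would give one sparse binary graph per type, which can then be stacked into the tensor.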

Summary

Introduction

Relational data are ubiquitous in various fields (web, biology, neurology, sociology, communication, economics, etc.), and their accessibility has kept increasing in recent years. To deal with multiple graphs, various models and methods following different approaches have been proposed to analyze these networks. To the best of our knowledge, this is the first attempt to formulate a model-based co-clustering for sparse three-way data. To this end, we rely on the latent block model (Govaert and Nadif 2013) for its flexibility in handling any type of data matrix. The main contributions are:

  • We propose a Sparse PLBM (SPLBM), a suitable probabilistic model for clustering of multiple graphs.

  • Using the ensemble method, we show that the proposed algorithm, which can be viewed as an implicit consensus clustering for multiple graphs, is more effective than explicit clustering obtained by traditional consensus clustering methods.
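For readers unfamiliar with explicit consensus clustering, one common building block of cluster ensembles is the co-association matrix, which records how often two nodes are placed in the same cluster across several partitions (for example, one partition per graph). The sketch below is a generic illustration of that idea; it is an assumption that it resembles the ensemble method used in the paper, and the name `co_association` and the toy partitions are hypothetical.

```python
import numpy as np

def co_association(partitions):
    """Fraction of partitions in which each pair of nodes shares a cluster.

    partitions: array-like of shape (m, n), one label vector per partition.
    Returns an (n, n) symmetric matrix with ones on the diagonal.
    """
    partitions = np.asarray(partitions)
    # same[p, i, j] is True when nodes i and j share a label in partition p.
    same = partitions[:, :, None] == partitions[:, None, :]
    return same.mean(axis=0)

# Three toy partitions of four nodes; label names are arbitrary per partition.
parts = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
C = co_association(parts)
```

A final consensus partition is typically obtained by clustering `C` itself, whereas the implicit approach described above estimates a single partition directly from all graphs at once.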

Related work
Poisson latent and stochastic block models
Poisson stochastic block model
Three-way tensor characteristics
Definition of the proposed model
Variational inference
Experiments
Datasets description and pre-processing
TSPLBM versus multi-view clustering
TSPLBM versus tensor decomposition approaches
Interpretation of multiple graph clustering results
Discussion: implicit consensus versus explicit consensus
Towards a consensus clustering
Ensemble method
Conclusion
