Abstract

Community detection is a fundamental problem in knowledge discovery and data mining. In this paper we propose a semi-binary matrix factorization (SBMF) model for community detection, which can be understood as a marriage between $K$-means clustering and (semi-)nonnegative matrix factorization. This leads to an easy-to-interpret factorization that can naturally handle overlapping communities. Unlike $K$-means, the proposed approach does not restrict each individual to belong to a single community, nor does it require the “soft membership” values to sum to one. We derive relatively easy-to-check uniqueness conditions suggesting that meaningful communities can be obtained via SBMF. Computing a (least-squares) optimal SBMF is a hard mixed-integer nonconvex optimization problem. We bypass this challenge by converting the problem into a coupled matrix-tensor factorization form that involves only continuous variables and can be tackled using tensor decomposition tools; the resulting solution can also be used to initialize optimization-based methods. We present experiments with real data to demonstrate the effectiveness of the proposed approach for community detection in coauthorship networks and in financial stock market data.
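To make the least-squares formulation concrete, the following is a minimal sketch and not the paper's algorithm. The abstract does not state the exact model, so the sketch assumes a factorization of the form X ≈ W Bᵀ, where W is a real-valued matrix and B is a binary membership matrix whose rows are not required to sum to one. The names `sbmf_objective` and `best_binary_memberships` and the toy data are hypothetical. For a fixed W, the sketch picks each row of B by brute force over all 2^K binary vectors, which illustrates why the integer part of the problem becomes expensive as K grows.

```python
import itertools
import numpy as np


def sbmf_objective(X, W, B):
    """Least-squares misfit ||X - W B^T||_F^2 (assumed model form, see lead-in)."""
    return float(np.linalg.norm(X - W @ B.T) ** 2)


def best_binary_memberships(X, W):
    """For fixed W, pick each row of B by enumerating all 2^K binary vectors.

    X has shape (F, N): one column per individual.
    W has shape (F, K): one column per community.
    Returns B with shape (N, K) and entries in {0, 1}.
    """
    K = W.shape[1]
    # All 2^K candidate membership vectors, shape (2^K, K).
    candidates = np.array(list(itertools.product([0, 1], repeat=K)), dtype=float)
    # Column j of recon is the reconstruction W @ candidates[j], shape (F, 2^K).
    recon = W @ candidates.T
    # Squared distance from every data column to every candidate, shape (N, 2^K).
    d2 = ((X[:, :, None] - recon[:, None, :]) ** 2).sum(axis=0)
    return candidates[d2.argmin(axis=1)]


# Toy data (hypothetical): 3 features, 2 communities, 4 individuals.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
# Individual 3 belongs to both communities, individual 4 to none,
# which a K-means style one-hot assignment could not express.
B_true = np.array([[1, 0],
                   [0, 1],
                   [1, 1],
                   [0, 0]], dtype=float)
X = W @ B_true.T

B_hat = best_binary_memberships(X, W)
print(B_hat)
print(sbmf_objective(X, W, B_hat))
```

On this noiseless toy input the enumeration recovers the overlapping memberships exactly. The enumeration cost doubles with each added community, which is one way to see why a continuous reformulation such as the coupled matrix-tensor factorization described above is attractive.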

