Abstract

The rapid growth in data volume poses great challenges for clustering. Multi-view data, which are collected from multiple sources or represented by multiple features, make these challenges even harder. How to cluster large-scale data efficiently has become one of the most active topics in large-scale clustering. Although several accelerated multi-view methods have been proposed to improve the efficiency of clustering large-scale data, their high computational complexity still prevents them from being applied in scenarios that demand high efficiency. To address the high computational complexity of existing multi-view methods on large-scale data, this paper proposes a fast multi-view clustering model via nonnegative and orthogonal factorization (FMCNOF). Instead of constraining the factor matrices to be nonnegative, as in traditional nonnegative and orthogonal factorization (NOF), we constrain one factor matrix of the model to be a cluster indicator matrix, which assigns cluster labels to the data directly, without an extra post-processing step to extract the cluster structure from the factor matrix. Meanwhile, the FMCNOF model uses the F-norm instead of the L2-norm, which makes the model very easy to optimize. Furthermore, an efficient optimization algorithm is proposed to solve the FMCNOF model. Unlike the traditional NOF optimization algorithm, which requires dense matrix multiplications, our algorithm divides the optimization problem into three decoupled, small-size subproblems that can be solved with far fewer matrix multiplications. By combining the FMCNOF model with the corresponding fast optimization method, the efficiency of the clustering process is significantly improved, and the computational complexity is nearly O(n). Extensive experiments on various benchmark data sets show that our approach greatly improves efficiency while achieving acceptable clustering performance.
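
As a rough illustration of the indicator-matrix idea described above, the following Python sketch factorizes each view under an F-norm loss while forcing one factor to be a 0/1 cluster indicator matrix shared across views, so labels can be read off directly with no post-processing. It is not the FMCNOF algorithm: it omits the nonnegative and orthogonal constraints and the three-subproblem solver, and reduces to a plain multi-view k-means-style alternating scheme. The function name `indicator_factorization` and all of its parameters are hypothetical and chosen only for this example.

```python
import numpy as np

def indicator_factorization(views, n_clusters, n_iter=20, seed=0):
    """Illustrative only (not the authors' method).

    Minimizes sum_v ||X_v - G F_v||_F^2 where G is a shared 0/1 cluster
    indicator matrix (one 1 per row) and F_v holds per-view centroids.
    views: list of arrays, each of shape (n_samples, d_v).
    """
    views = [np.asarray(X, dtype=float) for X in views]
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    labels = rng.integers(n_clusters, size=n)  # random initial assignment

    for _ in range(n_iter):
        counts = np.bincount(labels, minlength=n_clusters)
        nonempty = counts > 0
        empty = ~nonempty
        # Samples used to reseed empty clusters (same samples in every view).
        reseed_idx = rng.integers(n, size=int(empty.sum()))

        dist = np.zeros((n, n_clusters))
        for X in views:
            # F-step: with G fixed, the F-norm minimizer is the cluster mean.
            F = np.zeros((n_clusters, X.shape[1]))
            np.add.at(F, labels, X)
            F[nonempty] /= counts[nonempty, None]
            F[empty] = X[reseed_idx]
            # Squared distance up to a per-sample constant (||x||^2 dropped).
            dist += -2.0 * X @ F.T + (F ** 2).sum(axis=1)

        # G-step: each sample picks the cluster with the smallest total loss.
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    G = np.eye(n_clusters)[labels]  # indicator matrix, shape (n, n_clusters)
    return labels, G
```

In this simplified scheme each iteration costs time linear in the number of samples for a fixed number of clusters and feature dimensions, which is the kind of scaling the abstract refers to; the actual complexity analysis of FMCNOF is given in the paper itself.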
