Semi-supervised symmetric nonnegative matrix factorization (SNMF) has been widely used for both linear and nonlinear data clustering. However, the objective function of existing SNMF models is non-convex, which makes global optimization difficult and time-consuming. In this study, we leverage label information to propose a convex, unconstrained symmetric matrix factorization (SMF) model and analyze its convexity in detail. To capture high-order relationships among data, the model incorporates a hypergraph that is computationally simple, translation invariant, and naturally normalized. Our analysis and the corresponding experiments also indicate that the model is robust to outliers to some extent. Because the proposed model is convex and unconstrained, it can be optimized efficiently with the conjugate gradient (CG) method. To this end, we propose a novel Convex Combination-based Sufficient Descent CG (CSDCG) method, which outperforms other methods on 284 optimization problems from the CUTEst library. To evaluate the proposed approach, we conduct semi-supervised clustering experiments on eight datasets and compare it with ten state-of-the-art matrix factorization (MF) methods. The experimental results show that it achieves better clustering performance with less computational time than the compared methods. The code is available at https://github.com/Pokemer/HCSSMF.
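To make the optimization idea concrete, the following is a minimal sketch of a nonlinear CG iteration whose update parameter is a convex combination of two classical choices, with a sufficient-descent safeguard. It is not the CSDCG method itself, whose formulas are not given in this abstract. The function name `hybrid_cg`, the fixed mixing weight `theta`, the Fletcher-Reeves and PRP+ parameters, the Armijo backtracking line search, and the constant `c` are all assumptions chosen for illustration.

```python
import numpy as np

def hybrid_cg(f, grad, x0, theta=0.5, c=1e-4, tol=1e-6, max_iter=1000):
    """Illustrative nonlinear CG (hypothetical, not the paper's CSDCG).

    beta is a convex combination (1 - theta) * beta_FR + theta * beta_PRP+,
    with theta in [0, 1] fixed by the caller.
    """
    x = np.asarray(x0, dtype=float).copy()
    g = grad(x)
    d = -g  # first direction: steepest descent
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        # Armijo backtracking line search along d (assumed line search).
        t, fx, slope = 1.0, f(x), g @ d
        while f(x + t * d) > fx + 1e-4 * t * slope and t > 1e-16:
            t *= 0.5
        x_new = x + t * d
        g_new = grad(x_new)
        gg = g @ g
        beta_fr = (g_new @ g_new) / gg                    # Fletcher-Reeves
        beta_prp = max(0.0, g_new @ (g_new - g) / gg)     # PRP+
        beta = (1.0 - theta) * beta_fr + theta * beta_prp # convex combination
        d_new = -g_new + beta * d
        # Safeguard: restart with steepest descent unless the sufficient
        # descent condition g^T d <= -c * ||g||^2 holds.
        if g_new @ d_new > -c * (g_new @ g_new):
            d_new = -g_new
        x, g, d = x_new, g_new, d_new
    return x, k

# Usage on a small convex quadratic f(x) = 0.5 x^T A x - b^T x (toy data).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_min, iters = hybrid_cg(lambda x: 0.5 * x @ A @ x - b @ x,
                         lambda x: A @ x - b,
                         np.zeros(2))
```

On a convex, unconstrained objective such as this toy quadratic, any stationary point is a global minimizer, which is the property that makes a gradient-based method like CG a natural fit for the proposed model.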