Abstract

Concept factorization (CF) has attracted widespread attention for its promising performance in document clustering. Among CF variants, graph-regularized CF stands out because it improves clustering effectiveness by exploiting structural information in the data. However, the efficiency of these methods is limited in two ways: (1) introducing a full-sample graph increases computational complexity; (2) most of them require intensive multiplications during optimization, which impairs optimization efficiency. To address these issues, we propose an anchor-graph regularized orthogonal concept factorization (AROCF) method that improves both the efficiency and the effectiveness of document clustering. First, AROCF approximates the full-sample graph with a small-scale anchor graph, reducing the complexity of graph construction from quadratic to linear. Second, one of the factor matrices is constrained to be the cluster indicator matrix, which avoids the extra cost of running K-means after optimization. Finally, an orthogonal constraint restricts the freedom of the factorization to increase clustering effectiveness. To optimize the AROCF model, we develop a fast optimization strategy that combines the trace and orthogonality properties of matrices. Extensive experiments on various document datasets demonstrate the effectiveness and efficiency of AROCF.
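To make the anchor-graph idea concrete, the following is a minimal sketch of one common way to build such a graph. It is not taken from the paper: the abstract does not say how anchors are chosen or how edges are weighted, so the random anchor sampling, the Gaussian weighting, and the names `build_anchor_graph`, `num_anchors`, and `k` are all assumptions made for illustration. The point it shows is that each sample is compared only with m anchors, so the cost grows linearly with the number of samples n instead of quadratically.

```python
import numpy as np

def build_anchor_graph(X, num_anchors=10, k=3, seed=0):
    """Return an (n, m) sample-to-anchor affinity matrix Z.

    Illustrative sketch only; anchor selection and weighting are assumptions,
    not the procedure used by AROCF.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Assumption: anchors are sampled uniformly at random from the data.
    anchor_idx = rng.choice(n, size=num_anchors, replace=False)
    anchors = X[anchor_idx]

    # Squared Euclidean distances from every sample to every anchor.
    # This is an (n, m) array, so the work is linear in n for fixed m.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)

    # Indices of the k nearest anchors for each sample.
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    d2_near = d2[rows, nearest]

    # Assumption: Gaussian weights with a data-driven bandwidth.
    # Subtracting the row minimum only improves numerical stability;
    # it cancels out after row normalization.
    sigma2 = d2_near.mean() + 1e-12
    w = np.exp(-(d2_near - d2_near.min(axis=1, keepdims=True)) / sigma2)
    w /= w.sum(axis=1, keepdims=True)

    # Scatter the weights into Z; all other entries stay zero.
    Z = np.zeros((n, num_anchors))
    Z[rows, nearest] = w
    return Z
```

With n samples and m anchors, Z has n × m entries, whereas a full-sample graph would need n × n; for m much smaller than n this is the source of the quadratic-to-linear reduction described above.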

