Abstract

In this paper, we propose a new approach to solve the document-clustering using the K-Means algorithm. The latter is sensitive to the random selection of the k cluster centroids in the initialization phase. To evaluate the quality of K-Means clustering we propose to model the text document clustering problem as the max stable set problem (MSSP) and use continuous Hopfield network to solve the MSSP problem to have initial centroids. The idea is inspired by the fact that MSSP and clustering share the same principle, MSSP consists to find the largest set of nodes completely disconnected in a graph, and in clustering, all objects are divided into disjoint clusters. Simulation results demonstrate that the proposed K-Means improved by MSSP (KM_MSSP) is efficient of large data sets, is much optimized in terms of time, and provides better quality of clustering than other methods.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.