Abstract

Rapid advances in technology and falling storage costs allow individuals and organizations to generate and gather enormous amounts of text data. Extracting the documents a user is interested in from this volume of text is tedious, which necessitates text mining methods for discovering interesting information or knowledge in massive data. Document clustering is an effective text mining method that groups similar documents into the most relevant clusters. K-means is a classic clustering algorithm. However, the results obtained by K-means depend strongly on the initial cluster centers and may become trapped in local optima. This paper presents a K-means document clustering algorithm whose initial cluster centers are optimized by a genetic algorithm. Experimental studies on two different text datasets confirm that the proposed method produces more accurate clustering results than standard K-means clustering.
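The abstract does not specify the chromosome encoding, fitness function, or genetic operators, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes documents have already been converted to numeric vectors (for example TF-IDF), encodes a chromosome as k distinct document indices used as candidate centers, scores it by the sum of squared Euclidean distances to the nearest center, and uses elitist selection with a simple union-based crossover. All function names and parameters (`ga_initial_centers`, `pop_size`, `mutation_rate`, and so on) are hypothetical.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two vectors (assumed metric).
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    # Sum of squared distances from each point to its nearest center.
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans(points, centers, iters=20):
    # Plain Lloyd iterations starting from the given centers.
    centers = [list(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        new_centers = []
        for old, g in zip(centers, groups):
            if g:
                new_centers.append([sum(col) / len(g) for col in zip(*g)])
            else:
                new_centers.append(old)  # an empty cluster keeps its old center
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def ga_initial_centers(points, k, pop_size=20, generations=30,
                       mutation_rate=0.2, seed=0):
    # Hypothetical GA: each chromosome is a list of k distinct point indices.
    rng = random.Random(seed)
    n = len(points)

    def fitness(chrom):
        # Lower is better: SSE when the chosen points act as centers.
        return sse(points, [points[i] for i in chrom])

    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # elitist selection: better half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            pool = list(dict.fromkeys(a + b))  # union of both parents' genes
            child = rng.sample(pool, k)        # crossover: draw k genes from it
            if rng.random() < mutation_rate:
                # Mutation: swap one gene for an index not already in the child.
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return [points[i] for i in best]
```

To use the sketch, call `ga_initial_centers(vectors, k)` to obtain seed centers and pass them to `kmeans(vectors, seeds)`. Because Lloyd iterations never increase the sum of squared errors, the refined centers are at least as good as the seeds under this objective. Whether this improves clustering accuracy on real text data depends on the vectorization and on GA settings that the abstract does not report.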
