Abstract

In this paper, we propose a modified variable string length genetic algorithm (MVGA) for text clustering. The algorithm automatically evolves the optimal number of clusters while also producing a proper clustering of the data set. Each chromosome is encoded as a string of real numbers, with special indices that indicate the location of each gene. MVGA introduces more effective selection, crossover, and mutation operators, and it automatically adjusts the balance between population diversity and selective pressure across generations. The superiority of MVGA over the conventional variable string length genetic algorithm (VGA) is demonstrated on the Reuters text collection, in terms of both the number of clusters found and the clustering of the data set.
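As a rough illustration of the encoding idea, the following minimal sketch shows one way a variable-length chromosome of real-valued genes with position indices could be represented and decoded into a clustering. It is not the MVGA implementation: the abstract does not specify the gene layout, so the names `Gene`, `random_chromosome`, and `assign`, the choice of one cluster center per gene, and the use of squared Euclidean distance are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    index: int            # assumed: position label of this gene in the chromosome
    center: List[float]   # assumed: one real-valued cluster center per gene


def random_chromosome(points, k_min, k_max, rng):
    """Build a chromosome whose length (number of clusters) is chosen at random."""
    k = rng.randint(k_min, k_max)          # variable string length
    centers = rng.sample(points, k)        # seed centers from the data points
    return [Gene(index=i, center=list(c)) for i, c in enumerate(centers)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, chromosome):
    """Decode a chromosome: label each point with the index of its nearest center gene."""
    return [
        min(chromosome, key=lambda g: squared_distance(p, g.center)).index
        for p in points
    ]
```

Because chromosomes in the population may differ in length, each individual implies its own number of clusters. In a full genetic algorithm, a fitness function would score the clustering returned by `assign`, and the selection, crossover, and mutation operators would act on the list of genes.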
