Text document clustering (TDC) represents a key task in text mining and unsupervised machine learning, which partitions a specific documents’ collection into varied K-groups according to certain similarity/dissimilarity criterion. There exists a considerable amount of knowledge in the text clustering field and many attempts were carried out to resolve the TDC problem and improve the learning performance. The multi-verse optimizer algorithm (MVO) is a stochastic population-based algorithm, which was recently introduced and successfully utilized to tackle many optimization problems that are complex. The original MVO performance is limited to the utilization of only the best solution in the exploitation phase (local search capability), which makes it suffer from entrapment in local optima and low convergence rate. This paper aims to propose a novel method of modifying the MVO algorithm called link-based Multi-verse optimizer algorithm (LBMVO) to enhance the exploitation phase in the original MVO. The enhancement involves adding a neighbor operator to the MVO algorithm to enhance the search capability via a novel probability factor, namely neighborhood selection strategy (NSS). The proposed LBMVO’s effectiveness was tested on six standard datasets, which are used in the text clustering domain in addition to five standard datasets, which are utilized in the data clustering domain. The experiments revealed that the modified MVO with NSS has boosted the results in terms of error rate, accuracy, recall, precision, F-measure, purity, entropy criteria, and high convergence rate. Generally, LBMVO has outperformed or at least showed that it is profoundly competitive compared with the original MVO algorithm and with widely known clustering techniques like Spectral, Agglomerative, Density-based spatial clustering of applications with noise (DBSCAN), K-means, K-means++ clustering techniques and the optimization algorithms like harmony search (HS), genetic algorithm (GA), particle swarm optimization (PSO), krill herd algorithm (KHA), covariance matrix adaptation evolution strategy (CMAES), coyote optimization algorithm (COA), as well as original MVO.
Read full abstract