Clustering is a widely used technique that has proven its capability in diverse fields such as data analytics, business intelligence, social mining, image recognition, document clustering, and bioinformatics. It extracts valuable information from a pre-defined set of data and groups similar items into the same cluster. Many algorithms based on several clustering approaches have been presented in the literature. Partitional clustering algorithms such as k-means, k-medoids, and k-harmonic means are widely popular because of their simplicity and ease of implementation. However, these traditional algorithms suffer from several limitations, such as becoming trapped in local optima and depending on the quality of the initial solution. Heuristic algorithms have been proposed to alleviate these problems, but they too sometimes get stuck in local minima and exhibit a lack of balance between search mechanisms. Hence, this work presents an improved water flow optimizer (IWFO) algorithm for cluster analysis that can address these issues of traditional and heuristic algorithms. In the proposed IWFO algorithm, the initial solution is generated using the logistic chaotic map instead of random initialization so that, in turn, an optimal-quality solution is generated. The search mechanism of the WFO algorithm is enhanced with an improved search mechanism that combines non-linear functions with the previous best solution. Further, the local optima issue is alleviated using a multi-start search mechanism. The efficacy of the proposed IWFO algorithm is evaluated on twelve benchmark clustering datasets, and the results are compared with those of seventeen clustering algorithms. The simulation results are assessed using intra-cluster distance (intra), standard deviation (SD), rank, accuracy rate (AR), and detection rate (DR). A statistical test is also conducted to validate the efficacy of the proposed IWFO algorithm. The results show that the proposed IWFO algorithm obtains superior-quality results on most of the datasets.
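To make the initialization step and the intra-cluster distance measure concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give the map parameters or the exact objective, so the control parameter r = 4, the seed x0 = 0.7, the use of Euclidean distance to the nearest centroid, and all function names here are illustrative assumptions. The IWFO search and multi-start mechanisms are not reproduced.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def logistic_sequence(n: int, x0: float = 0.7, r: float = 4.0) -> List[float]:
    """Return n successive values of the logistic map x <- r * x * (1 - x).

    r = 4.0 and x0 = 0.7 are assumed values, not taken from the paper.
    Seeds such as 0, 0.25, 0.5, 0.75 and 1 should be avoided because the
    map collapses to a fixed point from them.
    """
    values: List[float] = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_centroids(data: Sequence[Point], k: int, x0: float = 0.7) -> List[List[float]]:
    """Place k candidate centroids inside the data's bounding box.

    Each coordinate is a logistic-map value in [0, 1] scaled to the
    range of the corresponding feature (hypothetical helper).
    """
    dims = len(data[0])
    lows = [min(p[d] for p in data) for d in range(dims)]
    highs = [max(p[d] for p in data) for d in range(dims)]
    chaos = logistic_sequence(k * dims, x0)
    return [
        [lows[d] + chaos[i * dims + d] * (highs[d] - lows[d]) for d in range(dims)]
        for i in range(k)
    ]


def intra_cluster_distance(data: Sequence[Point], centroids: Sequence[Point]) -> float:
    """Sum of Euclidean distances from each point to its nearest centroid.

    This is one common definition of the measure, assumed here.
    """
    return sum(min(math.dist(p, c) for c in centroids) for p in data)


if __name__ == "__main__":
    points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
    start = chaotic_centroids(points, k=2)
    print("initial centroids:", start)
    print("intra-cluster distance:", intra_cluster_distance(points, start))
```

In a sketch like this, an optimizer would treat the chaotic centroids as its starting candidate and try to reduce the intra-cluster distance. Unlike a pseudo-random draw, the logistic map gives a deterministic sequence for a given seed.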