Abstract Over the last decade, the collective intelligent behavior of groups of animals, birds, and insects has attracted the attention of researchers. Swarm intelligence is the branch of artificial intelligence that implements intelligent systems by taking inspiration from the collective behavior of social insects and other animal societies. Many meta-heuristic algorithms based on the collective behavior of swarms, which emerges from complex interactions without supervision, have been used to solve complex optimization problems. Data clustering organizes data into groups called clusters, such that each cluster contains similar data. The resulting clusters may be disjoint. Accuracy and efficiency are the important measures in data clustering. Several recent studies describe bio-inspired systems as information processing systems capable of some cognitive ability. However, existing popular bio-inspired algorithms for data clustering do not maintain the balance between exploration and exploitation needed to produce better clustering results. In this article, we propose a bio-inspired algorithm, namely social spider optimization (SSO), for clustering that maintains a good balance between exploration and exploitation using female and male spiders, respectively. We compare the results of the proposed SSO algorithm with K means and other nature-inspired algorithms such as particle swarm optimization (PSO), ant colony optimization (ACO), and improved bee colony optimization (IBCO). We find SSO to be more robust, as it produces better clustering results. Although SSO solves the problem of getting stuck in a local optimum, it needs to be modified to locate the best solution in the proximity of the generated global solution. Hence, we hybridize SSO with K means, which produces good results in local searches. 
We compare the proposed hybrid algorithms, SSO+K means (SSOKC), integrated SSOKC (ISSOKC), and interleaved SSOKC (ILSSOKC), with K means+PSO (KPSO), K means+genetic algorithm (KGA), K means+artificial bee colony (KABC), and interleaved K means+IBCO (IKIBCO), and find that the proposed hybrids produce better clustering results. We use the sum of intra-cluster distances (SICD), average cosine similarity, accuracy, and inter-cluster distance to measure and validate the performance and efficiency of the proposed clustering techniques.
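
As an illustration of two of the measures named above, the following minimal Python sketch computes SICD and an average cosine similarity for a given clustering. It is not taken from the article: it assumes SICD is the sum of unsquared Euclidean distances from each point to its assigned centroid, and that average cosine similarity is the mean cosine between each point and its assigned centroid. The function names `sicd` and `average_cosine_similarity` and the toy data are hypothetical.

```python
import math


def sicd(points, labels, centroids):
    """Sum of intra-cluster distances (assumed definition: unsquared
    Euclidean distance from each point to its assigned centroid)."""
    return sum(math.dist(p, centroids[k]) for p, k in zip(points, labels))


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def average_cosine_similarity(points, labels, centroids):
    """Mean cosine similarity between each point and its assigned centroid
    (assumed definition, for illustration only)."""
    sims = [cosine(p, centroids[k]) for p, k in zip(points, labels)]
    return sum(sims) / len(sims)


# Toy data (hypothetical): two well-separated clusters in the plane.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 1.0), (9.0, 3.0)]
labels = [0, 0, 1, 1]
centroids = [(1.0, 2.0), (9.0, 2.0)]

print("SICD:", sicd(points, labels, centroids))
print("Average cosine similarity:",
      average_cosine_similarity(points, labels, centroids))
```

Under these assumed definitions, a lower SICD indicates more compact clusters, while a higher average cosine similarity indicates that points are better aligned with their centroids.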