Clustering techniques have recently gained importance owing to their wide range of applications in data mining, pattern recognition, bioinformatics, and many other fields. In this paper, a new approach called spotted hyena bat algorithm (SHBA)-based incremental clustering with the Spark framework is proposed. The SHBA is derived by integrating the spotted hyena optimiser (SHO) and the bat algorithm (BA), which makes it well suited to handling high-dimensional data and yields a unique solution with highly satisfactory results. Incremental clustering is performed in the Spark framework using master and slave nodes. The proposed approach clusters data effectively, especially high-dimensional data, is more robust against various attacks, and provides a more unified solution. Evaluated with the Jaccard coefficient, Rand coefficient, and clustering accuracy, the proposed SHBA achieves values of 0.950, 0.943, and 0.962, respectively.
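For readers unfamiliar with the evaluation metrics named above, the following minimal sketch shows how the Jaccard and Rand coefficients are commonly computed by counting pairs of points. It assumes the standard pair-counting definitions, which the abstract does not spell out, so the paper's exact formulation may differ. The function names `pair_counts`, `jaccard_coefficient`, and `rand_coefficient` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pair_counts(truth, pred):
    """Classify every pair of points by whether the two labelings agree.

    Returns (ss, sd, ds, dd):
      ss: same cluster in both labelings
      sd: same in truth, different in pred
      ds: different in truth, same in pred
      dd: different in both labelings
    Assumption: standard pair-counting scheme, not confirmed by the paper.
    """
    if len(truth) != len(pred):
        raise ValueError("label lists must have the same length")
    ss = sd = ds = dd = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            ss += 1
        elif same_truth:
            sd += 1
        elif same_pred:
            ds += 1
        else:
            dd += 1
    return ss, sd, ds, dd


def jaccard_coefficient(truth, pred):
    """Jaccard coefficient: ss / (ss + sd + ds)."""
    ss, sd, ds, _ = pair_counts(truth, pred)
    denom = ss + sd + ds
    # With no co-clustered pairs in either labeling, treat them as identical.
    return ss / denom if denom else 1.0


def rand_coefficient(truth, pred):
    """Rand coefficient: (ss + dd) / (total number of pairs)."""
    ss, sd, ds, dd = pair_counts(truth, pred)
    total = ss + sd + ds + dd
    return (ss + dd) / total if total else 1.0


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    pred = [0, 0, 1, 2]
    print(jaccard_coefficient(truth, pred))  # 0.5
    print(rand_coefficient(truth, pred))     # 5/6, about 0.833
```

Both coefficients compare only which points are grouped together, so they are unaffected by how the cluster labels themselves are named. The pairwise loop is quadratic in the number of points and is meant for illustration only; it is not how such metrics would be computed on large data in a distributed setting.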