STCLARanS: An Improved Clustering Large Applications based on Randomized Search Algorithm using Slim-tree Technique

Ricardo Q Camungao

doi:10.35940/ijrte.b1022.078219

Abstract

Clustering has been used in recent years for data interpretation when dealing with large databases in fields such as medicine, business, and engineering. Its use paved the way for the development of data mining techniques such as the CLARANS (Clustering Large Applications based on Randomized Search) algorithm. CLARANS is the most efficient k-medoids technique that uses a randomized strategy to identify the best medoids in a large dataset, and it surpasses both PAM (Partitioning Around Medoids) and CLARA (Clustering Large Applications) in terms of running time. This paper addresses the task of integrating the Slim-tree method into CLARANS to develop the proposed Slim-tree Clustering Large Applications based on Randomized Search (STCLARanS) algorithm. An experimental evaluation was conducted using synthetic and real datasets to compare the quality of the clustered output of CLARANS with that of the proposed STCLARanS algorithm. The Slim-tree method is used to pre-cluster the objects in the dataset, identifying the objects in the middle level of the tree as the sample objects used to start the clustering process. The proposed algorithm assumes that this new sampling strategy for drawing the initial cluster centers may yield better-quality clustered output than the CLARANS algorithm. The quality of the clustered output is measured by the accumulated distances of the objects to their cluster centers.
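The quality measure and the randomized medoid search described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `total_cost` and `clarans`, the parameters `num_local`, `max_neighbor`, and `seed`, and the use of Euclidean distance are assumptions made for illustration. The Slim-tree pre-clustering step is not implemented; the hypothetical `initial` argument only marks where medoid indices drawn from such a sampling strategy could be supplied.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def total_cost(points, medoids):
    """Accumulated distance of every object to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def clarans(points, k, num_local=2, max_neighbor=50, seed=0, initial=None):
    """Randomized k-medoids search in the style of CLARANS (sketch).

    `initial` is a hypothetical hook: a list of k medoid indices used for
    the first restart, e.g. candidates taken from a pre-clustering step.
    """
    rng = random.Random(seed)
    n = len(points)
    best, best_cost = None, float("inf")
    for restart in range(num_local):
        if initial is not None and restart == 0:
            current = list(initial)
        else:
            current = rng.sample(range(n), k)
        current_cost = total_cost(points, current)
        tries = 0
        while tries < max_neighbor:
            # A neighbor differs from the current solution in one medoid.
            i = rng.randrange(k)
            candidate = rng.choice([j for j in range(n) if j not in current])
            neighbor = current[:i] + [candidate] + current[i + 1:]
            cost = total_cost(points, neighbor)
            if cost < current_cost:
                # Move to the better neighbor and reset the counter.
                current, current_cost = neighbor, cost
                tries = 0
            else:
                tries += 1
        if current_cost < best_cost:
            best, best_cost = current, current_cost
    return best, best_cost
```

Calling `clarans(points, k=2)` on a list of coordinate tuples returns the selected medoid indices together with their accumulated distance, so two sampling strategies can be compared by passing different `initial` values and comparing the returned costs.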
