K-means method with linear search algorithm to reduce Means Square Error (MSE) within data clustering

S Sriadhi, R Rahim, D Abdullah, S Gultom, M Martiano

doi:10.1088/1757-899x/434/1/012032

Abstract

The K-means method is limited in its ability to identify and group data by similarity of characteristics. This study combines K-means with a linear search algorithm (LSA) to improve the objectivity of data clustering relative to the standard K-means method. The variables used are study load (credits) and study period (semesters) of students across two academic years, amounting to 1,089 records. The data are analysed by statistically comparing the clustering results of K-means alone with those of K-means with LSA. The results show that K-means alone groups the data into three clusters, whereas K-means with LSA produces five. K-means with LSA identifies 327 records with distinct characteristics and places them in two new clusters, giving five clusters in total; K-means alone rates these records as similar to the rest and therefore produces only three clusters. The study concludes that K-means with LSA clusters the data more objectively and reduces the mean square error (MSE) that arises in standard K-means from its sensitivity to data similarity within a cluster. K-means with LSA is therefore recommended for clustering, to identify data objectively and avoid errors in the clustering process, enabling better use of the data.
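As a rough illustration of the quantities the abstract refers to, the following is a minimal sketch of standard K-means (Lloyd's algorithm) together with an MSE computation. It does not reproduce the LSA step, which the abstract does not describe. It assumes that MSE means the average squared Euclidean distance from each record to its assigned centroid, which may differ from the paper's exact definition. The function names, the seeded random initialisation, and the toy `(credits, semester)` records are hypothetical and are not taken from the study's data.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means; initial centroids are k points drawn at random (an assumption)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the centroid to the mean of its members.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old centroid if the cluster ended up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the assignment is stable
        centroids = new_centroids
    # Recompute labels so they always match the returned centroids.
    return centroids, assign(points, centroids)


def mse(points, centroids, labels):
    """Mean squared distance from each point to its assigned centroid (assumed definition)."""
    total = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return total / len(points)


# Hypothetical (credits, semester) records, invented for illustration only.
toy_records = [(0, 0), (0, 1), (10, 10), (10, 11)]
one_centroid, one_labels = kmeans(toy_records, k=1)
two_centroids, two_labels = kmeans(toy_records, k=2)
mse_one = mse(toy_records, one_centroid, one_labels)
mse_two = mse(toy_records, two_centroids, two_labels)
```

On this toy input, splitting the records into two clusters gives a much lower MSE than a single cluster. Note that MSE generally falls as the number of clusters grows, so a lower value for five clusters than for three does not by itself show which grouping is more appropriate.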

Full Text