Abstract

Population data is important for regional planning and development. The condition of a region is easier to assess when its sub-districts are grouped, and data mining techniques can identify patterns and relationships in population data for this purpose. The K-Means algorithm is a clustering technique that divides data into groups, or clusters, based on similar characteristics. This research applies several variants of the K-Means method to cluster the sub-districts of Bojonegoro Regency according to population data. The study is quantitative and exploratory, and it examines two variants. The first is Kernel K-Means, which uses a mapping function to map the data to a higher-dimensional space before clustering. The second is Fast K-Means, which reduces model training time in order to ease the problem of recalculating cluster centers as the amount of data increases. The data source is secondary population data with four variables (births, deaths, in-migration, and out-migration) obtained from the Satu Data Bojonegoro website developed by the Bojonegoro Regency Government. The best-performing approach is found to be Kernel K-Means with 5 clusters. Clustering performance is evaluated by measuring the average distance within each cluster. With 5 clusters, the coordinate pattern of the data under Kernel K-Means shows a smooth initial trend, so the clusters formed are clearly distinguishable. The study concludes that Kernel K-Means is the best K-Means approach for grouping the sub-districts of Bojonegoro Regency.
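The evaluation idea in the abstract (cluster, then score the result by the average distance within clusters) can be illustrated with a minimal sketch. It uses plain K-Means only, not the Kernel or Fast variants studied in the paper. The abstract does not give the exact formula for the score, so the sketch assumes it is the mean Euclidean distance from each point to its assigned centroid. The rows below are made-up placeholder values, not figures from Satu Data Bojonegoro, and the function names are hypothetical.

```python
import math
import random

# Hypothetical rows: (births, deaths, in-migration, out-migration).
# These values are invented for illustration; they are NOT the study's data.
rows = [
    (410, 220, 95, 120), (395, 240, 80, 110), (620, 310, 150, 170),
    (605, 330, 160, 185), (180, 140, 40, 60), (200, 150, 35, 55),
    (820, 400, 260, 240), (790, 420, 250, 255),
]

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=100, seed=0):
    """Plain (Lloyd's) K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iters):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                # Move the center to the per-variable mean of its members.
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's center
        if new_centroids == centroids:  # converged: centers stopped moving
            break
        centroids = new_centroids
    return assign(points, centroids), centroids

def avg_within_cluster_distance(points, labels, centroids):
    """Assumed score: mean distance from each point to its own centroid."""
    return sum(math.dist(p, centroids[l])
               for p, l in zip(points, labels)) / len(points)

# Compare several cluster counts by the assumed score.
for k in range(2, 6):
    labels, centroids = kmeans(rows, k)
    score = avg_within_cluster_distance(rows, labels, centroids)
    print(f"k={k}: average within-cluster distance = {score:.2f}")
```

A lower score means tighter clusters, but the score generally shrinks as the number of clusters grows, so it is usually read alongside how the curve flattens rather than minimized outright. In practice the variables would also typically be rescaled before clustering so that no single variable dominates the distance; the sketch skips that step for brevity.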
