Abstract

Covid-19 data have been analyzed in many studies, but no study has yet classified cases across all provinces in Indonesia. This study uses two clustering algorithms, K-Means and K-Medoids, to group positive, recovered, and death cases in the Covid-19 data into three clusters: low, medium, and high. The research data are Covid-19 case data from all provinces in Indonesia from 2020 to 2021. Three distance measures are used in the clustering calculations: Chebyshev distance, Manhattan distance, and Euclidean distance. Silhouette Coefficient tests of the three distance measures show that Manhattan distance is the best distance measure for both K-Means and K-Medoids. Furthermore, evaluating the resulting clusters with the Sum of Squared Errors (SSE), Silhouette Coefficient (SC), and Davies-Bouldin Index (DBI) shows that the K-Means algorithm performs better on the SC and DBI measures. This result is evidenced by SC values of 0.838, 0.838, and 0.925 for positive, recovered, and death cases, respectively, and by DBI values of 0.305 for positive cases, 0.295 for recovered cases, and 1.569 for death cases. These values indicate that K-Means is superior to K-Medoids in grouping and placing clusters.
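As an illustration of the three distance measures and the Silhouette Coefficient named above, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy points, and the cluster labels are hypothetical and chosen only to show how the chosen distance measure enters the Silhouette calculation.

```python
from statistics import mean


def chebyshev(a, b):
    # Largest absolute difference over all coordinates.
    return max(abs(x - y) for x, y in zip(a, b))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def silhouette_coefficient(points, labels, dist):
    """Mean silhouette value over all points for a given distance function.

    Assumes at least two clusters. Points that are alone in their
    cluster are scored 0, a common convention.
    """
    clusters = set(labels)
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        same = [
            dist(p, q)
            for j, (q, other) in enumerate(zip(points, labels))
            if other == lab and j != i
        ]
        if not same:
            scores.append(0.0)
            continue
        a = mean(same)  # mean distance to the point's own cluster
        b = min(        # mean distance to the nearest other cluster
            mean(dist(p, q) for q, other in zip(points, labels) if other == c)
            for c in clusters
            if c != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical toy data (not from the study): six 2-D points in three
# clusters standing in for "low", "medium", and "high".
points = [(1.0, 2.0), (2.0, 1.0), (10.0, 12.0),
          (11.0, 10.0), (30.0, 28.0), (29.0, 31.0)]
labels = [0, 0, 1, 1, 2, 2]

for name, dist in [("Chebyshev", chebyshev),
                   ("Manhattan", manhattan),
                   ("Euclidean", euclidean)]:
    print(name, round(silhouette_coefficient(points, labels, dist), 3))
```

A Silhouette Coefficient close to 1 indicates compact, well-separated clusters, which is why a comparison like the one in this study can select the distance measure with the highest value.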
