Abstract

With the rapid growth in medicine, a method for clustering drug composition data is essential to make it easier for the industry to define medicine compositions. K-means clustering is one way to cluster drug compositions. In this paper, we use the Word2Vec model to convert each drug composition into a vector, cluster the vectors using K-means, and visualize the clustering results. For Word2Vec, we use two methods, namely Continuous Bag-of-Words (CBOW) and Skip-gram (SG). For K-means, we determine the number of centroids using the Elbow Criterion and the Silhouette Coefficient. The dataset consists of more than 250 drug product names from Farmaku and K24. The experimental results show that the Silhouette Coefficient values obtained with CBOW and SG are 0.901 and 0.877, respectively. For both CBOW and SG, the best number of clusters is three.
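The cluster-count selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the 2-D `vectors` are made-up stand-ins for Word2Vec embeddings of drug compositions, the helper names (`init_farthest`, `kmeans`, `silhouette`) are hypothetical, and a deterministic farthest-point seeding replaces the usual random initialization so that the result is reproducible.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid_of(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def init_farthest(points, k):
    """Assumption: deterministic farthest-point seeding, not random init."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda p: min(sq_dist(p, c) for c in chosen)))
    return chosen

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign to nearest centroid, then recompute means."""
    centroids = init_farthest(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(centroid_of(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = set(labels)
    total = 0.0
    for i, p in enumerate(points):
        own = [math.sqrt(sq_dist(p, q)) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            continue  # contributes 0 to the total
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.sqrt(sq_dist(p, q)) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy vectors standing in for Word2Vec embeddings.
vectors = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (0, 10), (0, 11), (1, 10), (1, 11),
]

sse = {}     # within-cluster sum of squared errors, used for the elbow criterion
scores = {}  # mean silhouette coefficient per candidate k
for k in range(2, 6):
    labels, centroids = kmeans(vectors, k)
    sse[k] = sum(sq_dist(p, centroids[lab]) for p, lab in zip(vectors, labels))
    scores[k] = silhouette(vectors, labels)

best_k = max(scores, key=scores.get)
print("SSE per k:", sse)
print("Silhouette per k:", scores)
print("Best k by silhouette:", best_k)
```

On real data, the vectors would come from a trained Word2Vec model (CBOW or SG), and the elbow would be read from a plot of `sse` against `k`, while the silhouette score gives a single number to compare across candidate values of `k`.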
