Abstract

Vector space representation of web services plays a prominent role in enhancing the performance of different web service-based processes such as clustering, recommendation, ranking, and discovery. Generally, Term Frequency - Inverse Document Frequency (TF-IDF) and topic modeling methods are widely used for service representation. In recent years, word embedding techniques have attracted considerable research attention because they can represent services or documents based on semantic similarity. This paper provides a comparative analysis of two topic modeling techniques, i.e., Latent Dirichlet Allocation (LDA) and the Gibbs Sampling algorithm for Dirichlet Multinomial Mixture (GSDMM), and two word embedding techniques, i.e., word2vec and fastText. These topic modeling and word embedding techniques are applied to a dataset of web service documents for vector space representation. K-Means clustering is used to analyze the performance, and results are evaluated against standard evaluation criteria. Results demonstrate that the word2vec model outperforms the other techniques and yields a clear improvement in clustering quality.
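A minimal sketch of the embedding-then-cluster pipeline the abstract describes, assuming gensim for word2vec and scikit-learn for K-Means; the toy corpus, averaging scheme, and all parameters are illustrative assumptions rather than the paper's actual configuration or evaluation criteria.

```python
# Sketch only: word2vec document vectors clustered with K-Means.
# Corpus, hyperparameters, and the silhouette metric are assumptions.
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Hypothetical preprocessed web service descriptions (token lists).
corpus = [
    ["weather", "forecast", "temperature", "api"],
    ["currency", "exchange", "rate", "conversion"],
    ["hotel", "booking", "reservation", "travel"],
    ["flight", "booking", "travel", "airline"],
]

# Train a word2vec model on the service descriptions.
w2v = Word2Vec(sentences=corpus, vector_size=100, window=5,
               min_count=1, sg=1, epochs=50)

def document_vector(tokens, model):
    # Represent a service document as the mean of its word vectors.
    vectors = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vectors, axis=0) if vectors else np.zeros(model.vector_size)

doc_vectors = np.array([document_vector(doc, w2v) for doc in corpus])

# Cluster the document vectors and report one internal quality score.
k = 2  # illustrative number of clusters
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42).fit(doc_vectors)
print("cluster labels:", kmeans.labels_)
print("silhouette score:", silhouette_score(doc_vectors, kmeans.labels_))
```

The same clustering step could be reused with LDA, GSDMM, or fastText representations by swapping out how `doc_vectors` is built, which is the comparison the paper carries out.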
