Abstract

Classifying web services and labeling them by their functional features plays a major role in several fundamental service management tasks, such as service discovery, selection, ranking, and recommendation. Existing approaches leverage text mining techniques and follow a supervised learning process: a classifier is built from a training set of services and then applied to other services. This process requires intensive human effort to label the services in the training set. In this paper, we propose to leverage pool-based active learning to realize a scalable service classification approach. Instead of manually labeling a large number of services to construct a complete training set, the approach starts with a base classifier trained on a small training set and iteratively asks for the labels of the most informative services outside the initial training set. In this way, the classifier can achieve accuracy comparable to that of traditional classification methods with a much smaller training set. We use SVM as the base classifier because of its effectiveness in text classification. We also incorporate probabilistic topic models to address the issues caused by the sparse term vectors generated from service descriptions and to reduce the dimensionality, which improves efficiency. We conducted a comprehensive experimental study on real-world service data to demonstrate the effectiveness of the proposed approach.
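
To make the pool-based loop concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It rests on several assumptions that the abstract does not state: the query criterion is margin-based uncertainty sampling (the unlabeled item closest to the current hyperplane is treated as "most informative"), the problem is binary with labels in {-1, +1}, the SVM is a small linear one trained by Pegasos-style subgradient descent, and the service vectors are synthetic two-dimensional points standing in for topic-model features. All function and variable names (`train_linear_svm`, `active_learn`, `oracle`) are hypothetical.

```python
import random

def decision(w, b, x):
    """Signed score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Linear SVM via Pegasos-style subgradient descent (stand-in for a real SVM library)."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            violated = y[i] * decision(w, b, X[i]) < 1  # hinge loss is active
            w = [(1 - eta * lam) * wj for wj in w]      # regularization shrink
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def active_learn(X_pool, oracle, seed_idx, budget):
    """Pool-based active learning with margin-based uncertainty sampling (assumed criterion)."""
    labeled = list(seed_idx)
    labels = {i: oracle(i) for i in labeled}            # initial small training set
    unlabeled = [i for i in range(len(X_pool)) if i not in labels]
    for _ in range(budget):
        if not unlabeled:
            break
        w, b = train_linear_svm([X_pool[i] for i in labeled],
                                [labels[i] for i in labeled])
        # Query the pool item the current classifier is least sure about.
        query = min(unlabeled, key=lambda i: abs(decision(w, b, X_pool[i])))
        labels[query] = oracle(query)                   # ask the human annotator
        labeled.append(query)
        unlabeled.remove(query)
    model = train_linear_svm([X_pool[i] for i in labeled],
                             [labels[i] for i in labeled])
    return model, labeled

# Synthetic pool: two Gaussian clusters standing in for topic-model vectors.
rng = random.Random(42)
X, y_true = [], []
for i in range(40):
    label = 1 if i % 2 == 0 else -1
    X.append([rng.gauss(label, 0.5), rng.gauss(label, 0.5)])
    y_true.append(label)

# Start from two labeled services (one per class) and buy ten more labels.
(w, b), labeled = active_learn(X, lambda i: y_true[i], seed_idx=[0, 1], budget=10)
correct = sum(1 for x, y in zip(X, y_true) if (decision(w, b, x) >= 0) == (y > 0))
accuracy = correct / len(X)
print(f"labels used: {len(labeled)} of {len(X)}, pool accuracy: {accuracy:.2f}")
```

Here the `oracle` callback plays the role of the human annotator, so the number of calls to it is the labeling cost that the approach aims to keep small. A multi-class setting, as service classification normally is, would need a multi-class SVM and a matching uncertainty measure, which this sketch does not cover.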
