With the explosive growth in the number and diversity of Web services, considerable research has investigated Web service classification, as it underpins advanced service-oriented applications such as service discovery, selection, composition and recommendation. However, conventional approaches classify Web services indiscriminately, which raises two challenges. First, they do not take full advantage of the implicit relationships among the multi-dimensional information of Web services, such as the growing number of service categories. As a result, service features are learned and represented ineffectively, which undermines the overall accuracy of service classification. Second, they ignore the imbalance of service distributions, even though service categories exhibit distinct long-tail characteristics. This results in low classification accuracy for categories that contain few Web services. To learn implicit service features more effectively across the service repository, with particular attention to tail categories that contain few Web services, we propose a novel framework called DeepLTSC for more accurate Web service classification under long-tail distributions. In DeepLTSC, we first present an improved label attentive convolutional deep neural network (LACNN) that incorporates service categories and generates deep service features to improve overall classification performance. Then, a proposed service feature augmentation model (SFA), together with a focal loss function, is integrated into DeepLTSC to further optimize service features, aiming to boost classification accuracy on tail service categories. Extensive experiments are conducted on three large-scale real-world service datasets with different long-tail distributions. The results demonstrate that DeepLTSC significantly outperforms state-of-the-art approaches for Web service classification on both overall and tail categories.
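To illustrate why a focal loss can help on tail categories, the following is a minimal sketch of the commonly used multi-class formulation FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The abstract does not state the exact variant or hyperparameters used in DeepLTSC, so the function names, `gamma=2.0`, and `alpha=1.0` here are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one example with an integer class label."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example (assumed standard form, not DeepLTSC's exact one).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    examples, so hard examples (often from tail categories) contribute
    relatively more to the total loss. gamma = 0 recovers cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical scores: an easy example (confident, correct) and a hard one.
easy_logits, hard_logits, label = [5.0, 0.0, 0.0], [0.0, 1.0, 0.0], 0
easy_ratio = focal_loss(easy_logits, label) / cross_entropy(easy_logits, label)
hard_ratio = focal_loss(hard_logits, label) / cross_entropy(hard_logits, label)
```

In this sketch, `easy_ratio` is far smaller than `hard_ratio`: the confident prediction is down-weighted much more strongly than the misclassified one, which is the property that makes the loss attractive under long-tail distributions.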