Abstract

Similarity is a vital component of neighborhood-based collaborative filtering (CF). To improve recommendation quality, many similarity methods have been proposed and analyzed in recent decades. However, nearly all traditional similarity methods and many advanced ones use only the corated items among users to compute their similarity, which provides limited information in cold-start/sparse scenarios and can yield misleading results. A few advanced hybrid similarity models do consider items beyond the corated ones, which partly mitigates this limitation, but they still have drawbacks, such as failing to penalize noncorated items despite their many disadvantages. In this paper, we explore a new robust hybrid similarity model, the Wasserstein distance-based CF (WCF) model, for mitigating the cold-start problem of CF on sparse data. Specifically, we measure item similarity via the Wasserstein distance, which circumvents the drawbacks of the Bhattacharyya coefficient and KL divergence used in the literature and is thus more robust in cold-start/sparse scenarios. We further design a new multiplicative user similarity formula that treats all noncorated items as a whole, so as to prioritize the importance of corated items and weaken the negative effects of noncorated items; this also plays an important role in cold-start/sparse scenarios. As supplements, we propose two novel heuristic similarity factors that weaken the negative effects of popular users and popular items. We conduct extensive experiments on five real-world benchmark recommendation datasets to test WCF. The experimental results show the superiority of WCF over other existing similarity methods in cold-start/sparse scenarios.
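
To illustrate why a Wasserstein-based item similarity can behave better on sparse data, the following is a minimal sketch, not the WCF formula itself, which the abstract does not state. It assumes a 1-to-5 rating scale, compares the empirical rating distributions of two items with the one-dimensional Wasserstein distance (the summed gap between their cumulative distributions), and maps the distance to a similarity by dividing by the scale width. The names `rating_distribution`, `wasserstein_1d`, and `item_similarity`, and the normalization, are assumptions made for illustration.

```python
from collections import Counter

RATING_SCALE = (1, 2, 3, 4, 5)  # assumed 5-star scale, not taken from the paper


def rating_distribution(ratings, scale=RATING_SCALE):
    """Empirical probability of each rating value on the scale."""
    if not ratings:
        raise ValueError("item has no ratings")
    counts = Counter(ratings)
    total = len(ratings)
    return [counts.get(value, 0) / total for value in scale]


def wasserstein_1d(p, q, scale=RATING_SCALE):
    """1-Wasserstein distance between two distributions on the same ordered support.

    In one dimension this equals the area between the two CDFs.
    """
    distance = 0.0
    cdf_gap = 0.0
    for i in range(len(scale) - 1):
        cdf_gap += p[i] - q[i]  # running difference of the two CDFs
        distance += abs(cdf_gap) * (scale[i + 1] - scale[i])
    return distance


def item_similarity(ratings_a, ratings_b, scale=RATING_SCALE):
    """Hypothetical similarity in [0, 1]: 1 minus the normalized distance."""
    p = rating_distribution(ratings_a, scale)
    q = rating_distribution(ratings_b, scale)
    max_distance = scale[-1] - scale[0]
    return 1.0 - wasserstein_1d(p, q, scale) / max_distance


if __name__ == "__main__":
    # Two items with a single rating each, a typical cold-start situation.
    print(item_similarity([4], [5]))  # close ratings
    print(item_similarity([1], [5]))  # opposite ratings
```

In this sketch an item rated only 4 and an item rated only 5 come out as more similar than an item rated only 1 and an item rated only 5. The Bhattacharyya coefficient, by contrast, is zero in both cases because the two distributions share no rating value, so it cannot tell the pairs apart.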
