An Algorithm for Density Enrichment of Sparse Collaborative Filtering Datasets Using Robust Predictions as Derived Ratings

Dionisis Margaris, Dimitris Spiliotopoulos, Gregory Karagiorgos, Costas Vassilakis

doi:10.3390/a13070174


Open Access



Abstract

Collaborative filtering algorithms formulate personalized recommendations for a user by first analysing already entered ratings to identify other users with similar tastes (termed near neighbours), and then using the opinions of those near neighbours to predict which items the target user would like. However, in sparse datasets, too few near neighbours can be identified, resulting in low-accuracy predictions or even a total inability to formulate personalized predictions. This paper addresses the sparsity problem by presenting an algorithm that uses robust predictions, that is, predictions deemed highly probable to be accurate, as derived ratings. As a result, the density of sparse datasets increases, and both rating prediction coverage and accuracy improve. The proposed algorithm, termed CFDR, is extensively evaluated using (1) seven widely-used collaborative filtering datasets, (2) the two most widely-used correlation metrics in collaborative filtering research, namely the Pearson correlation coefficient and the cosine similarity, and (3) the two most widely-used error metrics in collaborative filtering, namely the mean absolute error and the root mean square error. The evaluation results show that, by successfully increasing the density of the datasets, the algorithm considerably improves the capacity of collaborative filtering systems to formulate personalized and accurate recommendations.
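The idea of adding robust predictions as derived ratings can be illustrated with a minimal Python sketch. It is not the CFDR implementation: the abstract does not state the robustness criteria, so the sketch assumes a hypothetical one (a prediction counts as robust when at least `min_support` users contributed to it), uses a placeholder predictor (`toy_predict`), and runs on made-up ratings. The helper names `density`, `enrich`, `mae`, and `rmse` are likewise illustrative.

```python
from math import sqrt

def density(ratings, n_items):
    """Fraction of the user-item matrix that holds a rating."""
    filled = sum(len(r) for r in ratings.values())
    return filled / (len(ratings) * n_items)

def toy_predict(ratings, user, item):
    """Placeholder predictor (assumption): mean rating given to `item`
    by the other users. Returns (prediction, number_of_contributors)."""
    votes = [r[item] for u, r in ratings.items() if u != user and item in r]
    if not votes:
        return None, 0
    return sum(votes) / len(votes), len(votes)

def enrich(ratings, items, predict, min_support=2):
    """Return a copy of `ratings` in which robust predictions are stored
    as derived ratings. Robustness test (assumption): the prediction is
    backed by at least `min_support` contributing users. Predictions are
    always computed from the original ratings, never from derived ones."""
    enriched = {u: dict(r) for u, r in ratings.items()}
    for user, rated in ratings.items():
        for item in items:
            if item in rated:
                continue
            value, support = predict(ratings, user, item)
            if value is not None and support >= min_support:
                enriched[user][item] = value
    return enriched

def mae(pairs):
    """Mean absolute error over (predicted, actual) pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)

def rmse(pairs):
    """Root mean square error over (predicted, actual) pairs."""
    return sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

# Illustrative data only; not taken from the paper's datasets.
items = ["a", "b", "c", "d"]
ratings = {
    "u1": {"a": 4, "b": 5},
    "u2": {"a": 4, "c": 2},
    "u3": {"a": 5, "b": 3, "c": 1, "d": 4},
}

enriched = enrich(ratings, items, toy_predict, min_support=2)
print(density(ratings, len(items)), density(enriched, len(items)))
```

In this toy run, item `d` has a single rater, so predictions for it fall below the assumed support threshold and are not stored, while predictions for `b` and `c` are. The `mae` and `rmse` helpers take (predicted, actual) pairs, which is how held-out ratings would typically be scored.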

Highlights

Collaborative filtering (CF) algorithms formulate personalized recommendations by taking into account users’ ratings that denote their interests and tastes
None reported. The algorithm is based on matrix factorization, and a value is predicted for every item for every user
We presented the CFDR algorithm, which is a novel CF algorithm for enhancing the density of sparse CF datasets

Summary

Introduction

Collaborative filtering (CF) algorithms formulate personalized recommendations by taking into account users’ ratings that denote their interests and tastes. These algorithms identify the users whose interests and tastes are highly similar to those of the user for whom the recommendation will be formulated. These users are called “near neighbours” (NNs). Their ratings are used to formulate rating predictions, which subsequently result in the formulation of recommendations [1,2].
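The NN-based prediction step described above can be sketched as follows. This is a minimal illustration of one common user-based CF formulation (Pearson similarity plus a mean-centred weighted average), not necessarily the exact variant used in the paper. The toy ratings, the rule of keeping only positively correlated users as NNs, and the function names `mean`, `pearson`, and `predict` are all assumptions made for the example.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Values are illustrative only.
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 3, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i3": 2, "i4": 1},
}

def mean(user):
    """Average of all ratings entered by `user`."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def pearson(u, v):
    """Pearson correlation of u and v over the items both have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mu, mv = mean(u), mean(v)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den_u = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common))
    den_v = sqrt(sum((ratings[v][i] - mv) ** 2 for i in common))
    return num / (den_u * den_v) if den_u and den_v else 0.0

def predict(u, item):
    """Mean-centred weighted average over u's NNs that rated `item`.
    Returns None when no NN is available, which is the coverage
    failure that sparse datasets tend to produce."""
    nns = [(v, pearson(u, v)) for v in ratings if v != u and item in ratings[v]]
    nns = [(v, s) for v, s in nns if s > 0]  # assumption: NN means positive similarity
    if not nns:
        return None
    weighted = sum(s * (ratings[v][item] - mean(v)) for v, s in nns)
    return mean(u) + weighted / sum(s for _, s in nns)

print(predict("alice", "i4"))
print(predict("alice", "i9"))
```

Here only `bob` qualifies as a near neighbour of `alice`, since `carol` is negatively correlated with her, so the prediction for `i4` is driven by how far `bob` rated that item above his own average. The second call returns `None` because no user has rated `i9`.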

Results

Discussion

Conclusion