Matrix factorization is one of the fundamental approaches to recommender systems. With the popular L2 loss, models tend to overfit to predictions that deviate significantly from the actual ratings, because the squared error gives these predictions a disproportionately large weight. However, predicting an actual rating of 5 as 1 or as 2 makes no essential difference in the application. In this paper, we design a sigmoid-like function to control the loss of each individual prediction, which has two advantages. First, it reduces the loss corresponding to significantly deviated predictions, and therefore also reduces the impact of these predictions, some of which may be caused by outliers. Second, it is independent of two classical overfitting control techniques, which rely on regularization terms and validation data, respectively. Hence, it can be combined with them to form a more powerful method. Experiments on six benchmark datasets compare the proposed loss with different loss functions. Results show that the proposed loss function performs well in terms of MAE, RMSE, and NDCG, but less well in terms of HR and MAP.
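The exact form of the sigmoid-like function is not stated here, so the following is only a minimal sketch under an assumed form: a tanh-based loss (tanh being a rescaled sigmoid) that behaves like the squared error for small errors and saturates for large ones. The names `bounded_loss`, `bounded_loss_grad`, and the parameter `scale` are hypothetical and chosen for illustration.

```python
import math


def l2_loss(error: float) -> float:
    """Standard squared error for a single prediction."""
    return error ** 2


def bounded_loss(error: float, scale: float = 4.0) -> float:
    """Hypothetical sigmoid-like loss (an assumption, not the paper's formula).

    For small errors tanh(x) is close to x, so the value is close to error**2.
    For large errors tanh saturates at 1, so the value approaches `scale`.
    """
    return scale * math.tanh(error ** 2 / scale)


def bounded_loss_grad(error: float, scale: float = 4.0) -> float:
    """Derivative of bounded_loss with respect to the error.

    d/de [scale * tanh(e^2 / scale)] = 2e * (1 - tanh^2(e^2 / scale)),
    which is close to the L2 gradient 2e for small errors and
    tends to zero for large errors.
    """
    t = math.tanh(error ** 2 / scale)
    return 2.0 * error * (1.0 - t * t)


if __name__ == "__main__":
    # Compare both losses for an actual rating of 5 and several predictions.
    actual = 5.0
    for predicted in (4.5, 4.0, 3.0, 2.0, 1.0):
        e = actual - predicted
        print(predicted, l2_loss(e), bounded_loss(e), bounded_loss_grad(e))
```

Under this assumed form, predicting a rating of 5 as 1 or as 2 yields almost the same loss and an almost vanishing gradient, whereas the L2 loss keeps growing quadratically. Because the change affects only the per-prediction loss, a regularization term can still be added to the objective unchanged.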