Abstract

The cold-start scenario is a critical problem for recommendation systems, especially in dynamically changing domains such as online news services. In this research, we aim at addressing the cold-start situation by adapting an unsupervised neural User2Vec method to represent new users and articles in a multidimensional space. Toward this goal, we propose an extension of the Doc2Vec model that is capable of representing users with unknown history by building embeddings of their metadata labels along with item representations. We evaluate our proposed approach with respect to different parameter configurations on three real-world recommendation datasets with different characteristics. Our results show that this approach may be applied as an efficient alternative to the factorization machine-based method when the user and item metadata are used and hence can be applied in the cold-start scenario for both new users and new items. Additionally, as our solution represents the user and item labels in the same vector space, we can analyze the spatial relations among these labels to reveal latent interest features of the audience groups as well as possible data biases and disparities.

Highlights

  • Both LightFM and our approach learn user and item metadata embeddings to improve recommendations in the cold-start setting; LightFM is an extended version of the matrix factorization method, while our work is based on the neural Doc2Vec model of Mikolov et al (2013)

  • We propose an extension of the basic User2Vec model, the Meta-User2Vec method, which takes user and item metadata labels as an additional input and trains them along with the id vectors, making it possible to infer these features for new items and new users

  • The LightFM method performs significantly better in most settings when only the user and item ids are available


Summary

From word vectors to user embeddings

The collaborative filtering user-profiling problem may be represented analogously to an NLP task: the items can be considered as words in a corpus, the user profile as a document, and the sequences of user actions (such as buying a product or reading an article) as sentences. Word2Vec and Doc2Vec techniques have proved to be competitive alternatives to traditional matrix factorization methods in collaborative filtering recommendation systems. Due to their relative simplicity and efficiency, they have been successfully applied in real-world recommender systems in different domains, as described by McCormick (2018): venues in Ozsoy (2016) and Grbovic (2018), e-commerce in Phi et al (2016) and Grbovic et al (2016), and music in Barkan et al (2016) and Karam (2017). Caselles-Dupré et al (2018) found that the optimal parameters of the Word2Vec model for recommendation tasks differ significantly from those for NLP. It was also observed that this approach gives results comparable to standard collaborative filtering techniques, especially for sparse datasets.
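The analogy above can be illustrated with a minimal sketch. It is not the authors' implementation: the interaction log, the identifiers (`u1`, `s1`, `article_a`, and so on), and the helper names `build_corpus` and `skipgram_pairs` are all hypothetical. The sketch only shows how user actions could be arranged into "documents" and "sentences", and how skip-gram style (target, context) pairs, the kind of training examples a Word2Vec-like model consumes, could be derived from them.

```python
from collections import defaultdict

# Hypothetical interaction log: (user_id, session_id, item_id), in time order.
interactions = [
    ("u1", "s1", "article_a"),
    ("u1", "s1", "article_b"),
    ("u1", "s1", "article_c"),
    ("u1", "s2", "article_d"),
    ("u2", "s3", "article_b"),
    ("u2", "s3", "article_c"),
]


def build_corpus(log):
    """Map each user ("document") to a list of sessions ("sentences") of items ("words")."""
    grouped = defaultdict(lambda: defaultdict(list))
    for user, session, item in log:
        grouped[user][session].append(item)
    return {user: list(sessions.values()) for user, sessions in grouped.items()}


def skipgram_pairs(sentence, window=1):
    """Return (target, context) item pairs within a symmetric window."""
    pairs = []
    for i, target in enumerate(sentence):
        start = max(0, i - window)
        stop = min(len(sentence), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((target, sentence[j]))
    return pairs


corpus = build_corpus(interactions)

# Flatten all sessions of all users into one list of training pairs.
pairs = [
    pair
    for sessions in corpus.values()
    for sentence in sessions
    for pair in skipgram_pairs(sentence)
]
```

The window size is an assumption chosen for readability; as noted above, parameter values that suit recommendation data may differ considerably from those used for text.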

User modeling in cold‐start situations
Proposed approach
Experimental validation
Models used for comparison
Datasets
MovieLens 100K dataset
Deskdrop dataset
Onet dataset
Experimental tasks
Evaluation procedure
Results
A comparison of the proposed method with other approaches and the baselines
Representing latent relations among user metadata and content
Results summary and discussion
Conclusions and future work
