Abstract

News recommender systems help readers cope with the overwhelming number of news articles by simplifying navigation and retrieving relevant information. Many conventional news recommender systems use collaborative filtering, which makes recommendations based on the behavior of users in the system. In this approach, the introduction of new users or new items causes the cold start problem: there is insufficient data on the new entries for collaborative filtering to draw any inferences about them. Content-based news recommender systems emerged to address the cold start problem. However, many content-based news recommender systems treat documents as a bag of words, neglecting the hidden themes of the news articles. In this paper, we propose a news recommender system that leverages topic models and the time spent on each article. We build an automated recommender system that filters news articles and makes recommendations based on users' preferences. We use topic models to identify the thematic structure of the corpus. These themes are incorporated into a content-based recommender system to filter out news articles whose themes are of less interest to users and to recommend articles that are thematically similar to users' preferences. Our experimental studies show that utilizing topic modeling and the time spent on a single article can outperform state-of-the-art recommendation techniques. The resulting recommender system based on the proposed method is currently operational at The Globe and Mail (http://www.theglobeandmail.com/).
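
The idea of combining topic distributions with reading time can be illustrated with a minimal sketch. It is not the system described in the paper: it assumes each article already has a topic distribution from some topic model, that a user's profile is the average of the topic vectors of the articles they read weighted by seconds spent, and that candidates are ranked by cosine similarity to that profile. The function names (`build_profile`, `cosine`, `recommend`), the weighting scheme, and the toy data are all hypothetical.

```python
import math


def build_profile(read_history):
    """Average topic vectors, weighted by time spent (an assumed weighting).

    read_history: list of (topic_vector, seconds_spent) pairs.
    """
    total = sum(seconds for _, seconds in read_history)
    if total <= 0:
        raise ValueError("read_history must contain positive reading time")
    dim = len(read_history[0][0])
    profile = [0.0] * dim
    for vector, seconds in read_history:
        for k in range(dim):
            profile[k] += vector[k] * seconds / total
    return profile


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profile, candidates, top_n=2):
    """Return the ids of the top_n candidates most similar to the profile."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(profile, item[1]),
        reverse=True,
    )
    return [article_id for article_id, _ in ranked[:top_n]]


# Toy data: three topics; the user lingered on a topic-0 article
# and only skimmed a topic-1 article.
history = [([0.8, 0.1, 0.1], 120.0), ([0.1, 0.8, 0.1], 10.0)]
candidates = {
    "a": [0.7, 0.2, 0.1],
    "b": [0.1, 0.1, 0.8],
    "c": [0.2, 0.7, 0.1],
}

profile = build_profile(history)
top = recommend(profile, candidates, top_n=2)
print(top)
```

Because the profile is built only from article content and the user's own reading time, a newly published article can be scored as soon as its topic distribution is available, which is how a content-based approach sidesteps the item side of the cold start problem.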
