Abstract

Content-based recommendation systems can promote media items (e.g., posts, videos, podcasts) to users based solely on a representation of the content, i.e., without using any user-related data such as views or user–item interactions. In this work, we study the potential of different textual representations (based on the content of the media) and semantic representations (created from a knowledge graph of media metadata). We also show that off-the-shelf automatic annotation tools from the Information Extraction literature can improve recommendation performance without any extra cost for training, data collection, or annotation. We first evaluate multiple textual content representations on two recommendation tasks: user-specific recommendation, which suggests new items to a user given a history of interactions, and item-based recommendation, which relies solely on content relatedness and is rarely investigated in the recommender-systems literature. We compare automatically extracted content (via ASR) against human-written summaries. We then derive a semantic content representation by combining manually created metadata with automatically extracted annotations, and we show that knowledge graphs, through their embeddings, constitute a great modality to seamlessly integrate extracted knowledge with legacy metadata and can be used to provide good content recommendations. Finally, we study how combining semantic and textual representations leads to superior performance on both recommendation tasks. Our code is available at https://github.com/D2KLab/ka-recsys to support experiment reproducibility.
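The item-based task described above ranks items purely by content relatedness. A minimal sketch of that idea, using a toy bag-of-words representation and cosine similarity (the item names, texts, and helper functions here are hypothetical illustrations, not the paper's actual pipeline or data):

```python
from collections import Counter
from math import sqrt

def bow(text):
    # Toy bag-of-words vector; a stand-in for the richer textual
    # representations (ASR transcripts, summaries) studied in the paper.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(v * b[t] for t, v in a.items() if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical catalogue: item id -> textual description of the media item.
items = {
    "ep1": "jazz history podcast trumpet improvisation",
    "ep2": "history of jazz music and improvisation",
    "ep3": "cooking pasta recipes italian kitchen",
}

def item_based_recommend(query_id, items, k=2):
    # Rank all other items by content relatedness to the query item.
    qv = bow(items[query_id])
    scores = {i: cosine(qv, bow(t)) for i, t in items.items() if i != query_id}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(item_based_recommend("ep1", items))  # "ep2" ranks above "ep3"
```

The paper's semantic variant would replace the bag-of-words vectors with knowledge-graph embeddings of the items; the ranking step stays the same.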
