Abstract

With the emergence of online music platforms, music recommender systems are becoming increasingly important in music information retrieval. Knowledge graphs (KGs) are a rich source of semantic information about entities and relations, enabling finer-grained modeling of entity relations to enhance recommendations. Existing research has focused primarily on modeling and analyzing structural triples, while largely ignoring the representation and information-processing capabilities of multi-modal data such as music videos and lyrics, which limits both the accuracy and the user experience of music recommender systems. To address these issues, we propose a Multi-modal Knowledge Graph Convolutional Network (MKGCN) that enhances music recommendation by leveraging the multi-modal knowledge of music items together with their high-order structural and semantic information. Specifically, MKGCN contains three aggregators: a multi-modal aggregator that fuses the text, image, audio, and sentiment features of each music item in a multi-modal knowledge graph (MMKG), and a user aggregator and an item aggregator that use graph convolutional networks to aggregate multi-hop neighboring nodes on the MMKG, modeling high-order representations of user preferences and music items, respectively. Finally, the aggregated embedding representations are used for recommendation. When training MKGCN, we adopt a ratio negative sampling strategy to generate high-quality negative samples. We construct four music MMKGs of different sizes from the public Last-FM dataset and conduct extensive experiments on them. The experimental results demonstrate that MKGCN achieves significant improvements over several state-of-the-art baselines.
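To make the aggregation pipeline concrete, below is a minimal sketch of the two core operations the abstract describes: fusing per-modality item features, and one GCN-style hop over MMKG neighbors. This is written in PyTorch; the concatenate-then-project fusion, the sum-style neighbor combination, and all dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class MultiModalAggregator(nn.Module):
    """Fuses per-modality features (text, image, audio, sentiment) of a
    music item into one embedding. Concatenate-then-project fusion is an
    assumption; the paper's fusion scheme may differ."""

    def __init__(self, modal_dims, out_dim):
        super().__init__()
        self.proj = nn.Linear(sum(modal_dims), out_dim)

    def forward(self, modal_feats):
        # modal_feats: list of (batch, dim_m) tensors, one per modality
        return torch.tanh(self.proj(torch.cat(modal_feats, dim=-1)))


class NeighborAggregator(nn.Module):
    """One GCN-style hop: combines an entity's embedding with the mean of
    its MMKG neighbors' embeddings (a 'sum'-type aggregator)."""

    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, self_emb, neighbor_embs):
        # self_emb: (batch, dim); neighbor_embs: (batch, n_neighbors, dim)
        agg = self_emb + neighbor_embs.mean(dim=1)
        return torch.relu(self.linear(agg))


# Toy usage with hypothetical feature dimensions.
fuse = MultiModalAggregator(modal_dims=[64, 128, 96, 8], out_dim=32)
hop = NeighborAggregator(dim=32)
item = fuse([torch.randn(4, 64), torch.randn(4, 128),
             torch.randn(4, 96), torch.randn(4, 8)])
neighbors = torch.randn(4, 5, 32)          # 5 sampled MMKG neighbors
item_high_order = hop(item, neighbors)     # (4, 32)
```

Stacking this hop k times yields the k-hop ("high-order") representations the abstract refers to. A KGCN-style model typically weights neighbors by user-relation scores rather than a plain mean; the mean is used here only to keep the sketch short.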
