Abstract

Recommendation systems play an important role in the rapid development of micro-video sharing platforms. Micro-videos carry rich multi-modal features, such as visual, acoustic, and textual features, and integrating these modalities is of great significance for personalized recommendation. However, most current multi-modal recommendation systems only enrich the feature representation on the item side, which leads to poor learning of user preferences. To address this problem, we propose a novel model named Learning the User's Deeper Preferences (LUDP), which constructs an item-item modal similarity graph and a user preference graph in each modality to improve the learning of item and user representations. Specifically, we construct item-item similarity graphs from the multi-modal features; item ID embeddings are propagated and aggregated on these graphs to learn the latent structural information of items. The user preference graph is constructed from historical user-item interactions, on which multi-modal features are aggregated to represent the user's preference for each modality. Finally, the two parts are combined as auxiliary information to enhance the user and item representations learned from collaborative signals, thereby capturing deeper user preferences. Extensive experiments on two public datasets (TikTok, MovieLens) demonstrate that our model outperforms state-of-the-art multi-modal recommendation methods.
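
To make the two components described above concrete, the following minimal Python/NumPy sketch illustrates one plausible reading of the pipeline: a kNN item-item similarity graph built from one modality's features, ID-embedding propagation over that graph, and mean-pooled aggregation of item modal features over a user's interaction history. The function names, kNN construction, and mean pooling are illustrative assumptions, not the paper's exact implementation.

import numpy as np

def build_item_item_graph(modal_features, k=10):
    # Hypothetical construction: cosine-similarity kNN graph over one modality's item features.
    norm = modal_features / (np.linalg.norm(modal_features, axis=1, keepdims=True) + 1e-8)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)            # exclude self-similarity
    adj = np.zeros_like(sim)
    topk = np.argsort(-sim, axis=1)[:, :k]    # keep only the k most similar items per row
    for i, nbrs in enumerate(topk):
        adj[i, nbrs] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)   # row-normalize so propagation averages neighbors

def propagate_item_embeddings(adj, item_id_emb, layers=1):
    # Propagate item ID embeddings over the item-item graph to pick up latent structural information.
    h = item_id_emb
    for _ in range(layers):
        h = adj @ h
    return h

def user_modal_preference(interactions, modal_features):
    # Aggregate the modal features of items each user interacted with
    # into a per-user modal preference vector (mean pooling, one simple choice).
    prefs = interactions @ modal_features
    counts = interactions.sum(axis=1, keepdims=True)
    return prefs / np.maximum(counts, 1.0)

# Toy usage with random data standing in for one modality (e.g. visual features).
rng = np.random.default_rng(0)
n_users, n_items, feat_dim, emb_dim = 5, 20, 8, 4
visual = rng.normal(size=(n_items, feat_dim))
item_id_emb = rng.normal(size=(n_items, emb_dim))
interactions = (rng.random((n_users, n_items)) < 0.2).astype(float)

adj = build_item_item_graph(visual, k=5)
structural_item_emb = propagate_item_embeddings(adj, item_id_emb)
user_visual_pref = user_modal_preference(interactions, visual)
print(structural_item_emb.shape, user_visual_pref.shape)   # (20, 4) (5, 8)

In the full model, these two outputs would serve only as auxiliary signals layered on top of the user and item representations learned from collaborative filtering, rather than replacing them.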
