Abstract

Recommendation systems are now widely applied in e-commerce, video websites, and social networking sites, bringing great convenience to people's daily lives. The types of information in recommendation systems are diverse and abundant; therefore, the proportion of unstructured multimodal data such as text, images, and video is increasing. However, due to the representation gap between different modalities, it is intractable to effectively use unstructured multimodal data to improve the effectiveness of recommendation systems. In this paper, we propose an end-to-end multimodal interest-related item similarity model (multimodal IRIS) to provide recommendations based on multimodal data sources. Specifically, the multimodal IRIS model consists of three modules: a multimodal feature learning module, an interest-related network (IRN) module, and an item similarity recommendation module. The multimodal feature learning module adds a knowledge-sharing unit among the different modalities. The IRN then learns the interest relevance between the target item and each historical item. Finally, the multimodal feature learning, IRN, and item similarity recommendation modules are unified into an integrated system to achieve performance gains and to accommodate the addition or absence of different modal data. Extensive experiments on real-world datasets show that, by exploiting the multimodal data that people may pay more attention to when selecting items, the proposed multimodal IRIS significantly improves accuracy and interpretability on the top-N recommendation task over state-of-the-art methods.
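To make the three-module structure described above concrete, the following is a minimal sketch in PyTorch. It is only an illustration of one plausible wiring under assumed design choices: the layer sizes, the shared projection standing in for the knowledge-sharing unit, the attention-style scoring inside the IRN, and all names (e.g., `MultimodalFeatureLearning`, `InterestRelatedNetwork`, `shared_unit`) are assumptions for exposition, not the authors' implementation.

```python
# Illustrative sketch of a three-module multimodal IRIS-style model.
# All dimensions, layer choices, and names are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultimodalFeatureLearning(nn.Module):
    """Per-modality encoders with a shared (knowledge-sharing) projection."""
    def __init__(self, text_dim, image_dim, hidden_dim):
        super().__init__()
        self.text_enc = nn.Linear(text_dim, hidden_dim)
        self.image_enc = nn.Linear(image_dim, hidden_dim)
        self.shared_unit = nn.Linear(hidden_dim, hidden_dim)  # shared across modalities

    def forward(self, text_feat, image_feat):
        t = self.shared_unit(F.relu(self.text_enc(text_feat)))
        v = self.shared_unit(F.relu(self.image_enc(image_feat)))
        return (t + v) / 2  # fused item representation


class InterestRelatedNetwork(nn.Module):
    """Scores the relevance of each historical item to the target item."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.score = nn.Linear(hidden_dim * 2, 1)

    def forward(self, target, history):  # target: (B, H), history: (B, N, H)
        tgt = target.unsqueeze(1).expand_as(history)
        w = torch.softmax(
            self.score(torch.cat([history, tgt], dim=-1)).squeeze(-1), dim=1
        )
        return (w.unsqueeze(-1) * history).sum(dim=1)  # interest-weighted user profile


class MultimodalIRIS(nn.Module):
    """Feature learning + IRN + item-similarity scoring, trained end to end."""
    def __init__(self, text_dim, image_dim, hidden_dim):
        super().__init__()
        self.features = MultimodalFeatureLearning(text_dim, image_dim, hidden_dim)
        self.irn = InterestRelatedNetwork(hidden_dim)

    def forward(self, target_text, target_image, hist_text, hist_image):
        target = self.features(target_text, target_image)               # (B, H)
        B, N, _ = hist_text.shape
        history = self.features(
            hist_text.reshape(B * N, -1), hist_image.reshape(B * N, -1)
        ).reshape(B, N, -1)
        profile = self.irn(target, history)
        # Item-similarity recommendation: score target against the interest profile.
        return torch.sigmoid((target * profile).sum(dim=-1))
```

In this sketch, dropping or adding a modality would amount to routing only the available encoders through the shared projection, which is one simple way to accommodate missing modal data as the abstract describes.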
