Abstract

Multimedia-based recommendation is a challenging task that requires not only learning collaborative signals from user-item interactions, but also capturing modality-specific clues about user interests from complex multimedia content. Although significant progress has been made on this challenge, we argue that current solutions remain limited by multimodal noise contamination. Specifically, a considerable proportion of multimedia content is irrelevant to user preferences, such as the background, overall layout, and brightness of images; the word order and semantic-free words in titles; <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">etc</italic>. We regard this irrelevant information as noise that contaminates the discovery of user preferences. Moreover, most recent approaches are built on graph learning, which means that noise is diffused into the user and item representations through message propagation, further amplifying the contamination. To tackle this problem, we develop a novel framework named Multimodal Graph Contrastive Learning (MGCL), which captures collaborative signals from interactions and uses the visual and textual modalities to extract modality-specific user preference clues. The key idea of MGCL involves two aspects. First, to alleviate noise contamination during graph learning, we construct three parallel graph convolution networks that independently generate three types of user and item representations, carrying collaborative signals, visual preference clues, and textual preference clues, respectively. Second, to eliminate as much preference-irrelevant noisy information as possible from the generated representations, we incorporate rich self-supervised signals into model optimization through contrastive learning, thereby enhancing the expressiveness of the user and item representations.
Note that MGCL is not limited to the graph learning schema; it can also be applied to most matrix factorization methods. We conduct extensive experiments on three public datasets to validate the effectiveness and scalability of MGCL <xref ref-type="fn" rid="fn1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><sup>1</sup></xref> <fn id="fn1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><label><sup>1</sup></label> We release the code of MGCL at <uri>https://github.com/hfutmars/MGCL</uri>. </fn>.
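To make the contrastive component of the abstract concrete, the sketch below shows an InfoNCE-style contrastive loss between two views of the same batch of user (or item) embeddings, e.g., the collaborative representation versus the visual-preference representation. This is a hypothetical illustration of the general technique, not MGCL's exact objective; the function name `info_nce`, the temperature value, and the NumPy formulation are all assumptions for exposition.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.2):
    """InfoNCE-style contrastive loss between two views of a batch of
    embeddings (rows are entities). Positives lie on the diagonal of the
    similarity matrix; all other rows in the batch act as negatives.
    Hypothetical sketch -- MGCL's actual loss may differ in detail."""
    # L2-normalize both views so the dot product is cosine similarity
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    # Softmax cross-entropy with the matching row as the positive class
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    log_probs = np.log(exp / exp.sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Aligned views (the same entity seen through two modalities) yield a low loss, while mismatched pairs are pushed apart, which is how the self-supervised signal suppresses preference-irrelevant noise in the learned representations.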
