Abstract

In this paper, we describe a neural network-based application that recommends multiple items from dialog context input while simultaneously generating a response sentence. We frame multi-item recommendation concretely as a set of clothing recommendations, which requires a multimodal fusion approach that can process both clothing-related text and images. We also examine how a pretrained language model can satisfy the requirements of the downstream models, and we propose gate-based multimodal fusion and multiprompt learning built on a pretrained language model. In addition, we propose an automatic evaluation technique that addresses the one-to-many mapping problem of multi-item recommendation. We construct and test a Korean fashion-domain multimodal dataset, and we verify various experimental settings using the automatic evaluation method. The results show that our proposed method yields confidence scores for multi-item recommendation results, in contrast to traditional accuracy-based evaluation.
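To make the gate-based fusion idea concrete, the sketch below shows one common form of a learned gate that mixes a text embedding and an image embedding per dimension. This is a minimal illustration of the general technique, not the paper's exact architecture; the layer sizes, the sigmoid gate, and all names here are assumptions for the example.

```python
import torch
import torch.nn as nn

class GatedMultimodalFusion(nn.Module):
    """Fuse a text embedding and an image embedding with a learned gate.

    The gate g in [0, 1] decides, per dimension, how much of each modality
    to keep: fused = g * text + (1 - g) * image. All dimensions here are
    illustrative assumptions, not the paper's configuration.
    """

    def __init__(self, text_dim: int = 768, image_dim: int = 512, fused_dim: int = 768):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, fused_dim)
        self.image_proj = nn.Linear(image_dim, fused_dim)
        # Gate conditioned on both projected modalities.
        self.gate = nn.Sequential(nn.Linear(2 * fused_dim, fused_dim), nn.Sigmoid())

    def forward(self, text_emb: torch.Tensor, image_emb: torch.Tensor) -> torch.Tensor:
        t = self.text_proj(text_emb)    # (batch, fused_dim)
        v = self.image_proj(image_emb)  # (batch, fused_dim)
        g = self.gate(torch.cat([t, v], dim=-1))
        return g * t + (1 - g) * v      # gated convex combination per dimension

# Usage with dummy encoder outputs (e.g., a pretrained LM for dialog text
# and an image encoder for clothing images).
fusion = GatedMultimodalFusion()
text_emb = torch.randn(4, 768)   # hypothetical dialog-context embeddings
image_emb = torch.randn(4, 512)  # hypothetical clothing-image features
fused = fusion(text_emb, image_emb)
print(fused.shape)  # torch.Size([4, 768])
```

The sigmoid gate lets the model weight text and image evidence differently for each fused dimension, which is the usual motivation for gating over simple concatenation when one modality is sometimes uninformative.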
