More to diverse: Generating diversified responses in a task oriented multimodal dialog system

Haoran Xie, Arunav Pratap Shandeelya, Mauajama Firdaus, Asif Ekbal

doi:10.1371/journal.pone.0241271.r007

Abstract

Multimodal dialogue systems, owing to their many applications, have attracted considerable attention from researchers and developers in recent times. With the release of the large-scale multimodal dialogue dataset of Saha et al. (2018) in the fashion domain, it has become possible to investigate dialogue systems that have both textual and visual modalities. Response generation is an essential aspect of every dialogue system, and making the responses diverse is an important problem. For any goal-oriented conversational agent, the system’s responses must be informative, diverse and polite, which may lead to a better user experience. In this paper, we propose an end-to-end neural framework for generating varied responses in a multimodal dialogue setup, capturing information from both text and images. A multimodal encoder with co-attention between the text and image is used to focus on the different modalities and obtain better contextual information. For effective information sharing across the modalities, we combine the text and image information using the BLOCK fusion technique, which helps in learning an improved multimodal representation. We employ stochastic beam search with the Gumbel-Top-k trick to achieve diversified responses while preserving the content and politeness of the responses. Experimental results show that our proposed approach performs significantly better than the existing and baseline methods in terms of distinct metrics, and thereby generates more diverse responses that are informative, interesting and polite without any loss of information. Empirical evaluation also reveals that images, when used along with the text, improve the efficiency of the model in generating diversified responses.
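The Gumbel-Top-k trick named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it is not a full stochastic beam search: it only shows the core step of adding independent Gumbel noise to log-probabilities and keeping the k largest perturbed scores, which draws k distinct candidates without replacement. The function name `gumbel_top_k` and the toy distribution are assumptions made for illustration.

```python
import math
import random


def gumbel_top_k(log_probs, k, rng=None):
    """Sample k distinct indices without replacement (illustrative sketch).

    log_probs: list of log-probabilities (float("-inf") means probability 0).
    k: number of distinct indices to draw.
    rng: optional random.Random instance, for reproducibility.
    """
    if k > len(log_probs):
        raise ValueError("k cannot exceed the number of candidates")
    rng = rng or random.Random()
    perturbed = []
    for index, log_p in enumerate(log_probs):
        # random() returns a value in [0, 1); clamp away from 0 so log() is defined.
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        perturbed.append((log_p + gumbel_noise, index))
    # Keeping the k largest perturbed scores yields a sample without replacement.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


# Hypothetical next-token distribution over four candidates; the last has probability 0.
toy_log_probs = [math.log(0.5), math.log(0.3), math.log(0.2), float("-inf")]
print(gumbel_top_k(toy_log_probs, k=2, rng=random.Random(0)))
```

Repeated calls with different seeds return different pairs of candidates, with higher-probability candidates appearing more often. That is the property a stochastic beam search relies on to produce varied beams rather than the same top-scoring ones each time.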

Similar Papers

More to diverse: Generating diversified responses in a task oriented multimodal dialog system
Mauajama Firdaus ... Arunav Pratap Shandeelya
PLOS ONE | VOL. 15
05 Nov 2020

Transformer-Based Multimodal Infusion Dialogue Systems
Bo Liu ... Tianyao Yu
Electronics | VOL. 11
20 Oct 2022

Knowledge-aware Multimodal Dialogue Systems
Lizi Liao ... Xiangnan He
15 Oct 2018

User Attention-guided Multimodal Dialog Systems
Chen Cui ... Xuemeng Song
18 Jul 2019
