Multimodal Dialogue System Research Articles

ABSTRACT This paper presents a multimodal dialogue system with user personality adaptation and multiple nonverbal behavior generation for a conversational android robot that plays a customer service role in recommending travel plans. Estimating the user's personality is useful for realizing a dialogue system that adapts its dialogue strategy to individual users. To improve the user's impression of the conversational robot as a customer service agent, we must control the robot's nonverbal behavior, such as its facial expressions and motions. Moreover, we need to design the dialogue strategy so that it both provides information and makes the conversation enjoyable, which we do by adding an ice-breaking step. Against this background, we develop a dialogue system prototype that adjusts the dialogue strategy based on the user's personality, as estimated by a pre-trained multimodal sensing model. Furthermore, we implement nonverbal behavior patterns suited to each situation, including the voice, motion, and facial expressions of the android robot, to give it appropriate and polite service-agent behaviors. A personality assessment game is introduced into the dialogue to prevent the user from becoming bored; the game plays an ice-breaking role in the dialogue. To encourage the user to engage with the robot, we implement complimenting the user as a dialogue act. We evaluate the proposed multimodal dialogue system for customer service by participating in a dialogue robot competition. The fair evaluation results from 26 dialogue users who conversed with multiple systems show that the proposed dialogue system achieved the best score among the 13 teams participating in the preliminary round of the competition, indicating that the user personality adaptation and the various elements implemented in our dialogue system improve the dialogue experience of general users. The results of a third-party evaluation and multiple regression analysis show that natural response is the most important factor contributing to user impression.

The rise of depression, anxiety, and suicide rates has led to increased demand for telemedicine-based mental health screening and remote patient monitoring (RPM) solutions to alleviate the burden on, and enhance the efficiency of, mental health practitioners. Multimodal dialog systems (MDS) that conduct on-demand, structured interviews offer a scalable and cost-effective solution to address this need. This study evaluates the feasibility of a cloud-based MDS agent, Tina, for mental state characterization in participants with depression, anxiety, and suicide risk. Sixty-eight participants were recruited through an online health registry and completed 73 sessions, with 15 (20.6%), 21 (28.8%), and 26 (35.6%) sessions screening positive for depression, anxiety, and suicide risk, respectively, using conventional screening instruments. Participants then interacted with Tina as they completed a structured interview designed to elicit calibrated, open-ended responses regarding the participants' feelings and emotional state. Simultaneously, the platform streamed their speech and video recordings in real time to a HIPAA-compliant cloud server to compute speech, language, and facial movement-based biomarkers. After their sessions, participants completed user experience surveys. Machine learning models were developed using the extracted features and evaluated with the area under the receiver operating characteristic curve (AUC). For both depression and suicide risk, affected individuals tended to have a higher percent pause time, while those positive for anxiety showed reduced lip movement relative to healthy controls. In terms of single-modality classification models, speech features performed best for depression (AUC = 0.64; 95% CI = 0.51-0.78), facial features for anxiety (AUC = 0.57; 95% CI = 0.43-0.71), and text features for suicide risk (AUC = 0.65; 95% CI = 0.52-0.78). The best overall performance was achieved by decision fusion of all models in identifying suicide risk (AUC = 0.76; 95% CI = 0.65-0.87). Participants reported that the experience was comfortable and that they shared their feelings. MDS is a feasible, useful, effective, and interpretable solution for RPM in real-world clinically depressed, anxious, and suicidal populations. Facial information is more informative for anxiety classification, while speech and language are more discriminative of depression and suicidality markers. In general, combining speech, language, and facial information improved model performance on all classification tasks.
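As a rough illustration of the evaluation terms used in the abstract above, the following minimal Python sketch computes AUC for two single-modality score lists and for their decision-level fusion. It is not the study's pipeline: the fusion rule (a plain average of per-session scores), the function names `auc` and `fuse`, and the toy labels and scores are all assumptions made up for this example, and the numbers do not correspond to any result reported in the abstract.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive session outscores
    a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fuse(*modality_scores: Sequence[float]) -> list[float]:
    """Decision fusion by averaging each session's scores across modalities.
    Averaging is an assumed rule chosen for illustration only."""
    return [sum(col) / len(col) for col in zip(*modality_scores)]


# Hypothetical toy data: four sessions, 1 = screened positive.
labels = [1, 1, 0, 0]
speech = [0.9, 0.4, 0.5, 0.1]   # made-up speech-model scores
text = [0.3, 0.8, 0.2, 0.6]     # made-up text-model scores
fused = fuse(speech, text)

print(auc(labels, speech), auc(labels, text), auc(labels, fused))
```

In this toy case each modality misranks a different session, so the averaged scores separate the two classes better than either modality alone. Real gains from fusion depend on the data and on the fusion rule actually used.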

Articles published on Multimodal Dialogue System

Domain-aware Multimodal Dialog Systems with Distribution-based User Characteristic Modeling

Multimodal Dialogue Systems via Capturing Context-aware Dependencies and Ordinal Information of Semantic Elements

A multimodal dialogue system for customer service based on user personality adaptation and dialogue strategies

Two-Step Masked Language Model for Domain-Adapting Multi-Modal Task-Oriented Dialogue Systems

Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model

A multimodal dialog approach to mental state characterization in clinically depressed, anxious, and suicidal populations.

Aoba_v3 bot: a multimodal chatbot system combining rules and various response generation models

A Unified Framework for Slot based Response Generation in a Multimodal Dialogue System

A multimodal dialogue system for improving user satisfaction via knowledge-enriched response and image recommendation

Transformer-Based Multimodal Infusion Dialogue Systems

Implementation of a Real-Time Emotion-Based Dialogue and Facial Expression Generation System for Virtual Humans

A spatially-aware dialogue system for immersive classrooms

On the Gap between Domestic Robotic Applications and Computational Intelligence

Aspect-Aware Response Generation for Multimodal Dialogue System

Personalized weather information for low-literate farmers using multimodal dialog systems

More to diverse: Generating diversified responses in a task oriented multimodal dialog system

EmoSen: Generating Sentiment and Emotion Controlled Responses in a Multimodal Dialogue System

Converness: Ontology‐driven conversational awareness and context understanding in multimodal dialogue systems

A Multimodal Dialog System for Language Assessment: Current State and Future Directions

Using Vision and Speech Features for Automated Prediction of Performance Metrics in Multimodal Dialogs
