Abstract

In this paper, we present a multimodal dialogue system that combines a neural response generation mechanism, a reranking mechanism, and a rule-based avatar control mechanism. Our system was submitted to the open track of the Fifth Dialogue System Live Competition and won second place. Notably, in the preliminary round it received the highest human evaluation score for visual information control (i.e., the avatar's speaking style). The competition evaluators' assessments indicated that our system generates natural utterances that fit the conversational context and topic, delivered in an appealing speaking style. Our analysis found that techniques such as post-processing of speech recognition results and the final response selection method are effective, but it also identified room for improvement, including speech recognition errors and remaining challenges in the reranking module.
