Abstract

Training generative models with a minimal corpus is one of the critical challenges for building open-domain dialogue systems. Existing methods tend to use the meta-learning framework, which pre-trains the parameters on all non-target tasks and then fine-tunes them on the target task. However, fine-tuning distinguishes tasks from the parameter perspective but ignores the model-structure perspective, resulting in similar dialogue models for different tasks. In this paper, we propose an algorithm that can customize a unique dialogue model for each task in the few-shot setting. In our approach, each dialogue model consists of a shared module, a gating module, and a private module. The first two modules are shared among all tasks, while the third differentiates into different network structures to better capture the characteristics of the corresponding task. Extensive experiments on two datasets show that our method outperforms all baselines in terms of task consistency, response quality, and diversity.
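
The sketch below is not the authors' code; it only illustrates how the three components described above could be wired together. The module types, sizes, and the gated fusion scheme are assumptions made for illustration.

```python
# Minimal sketch of a per-task dialogue model built from a shared module,
# a shared gating module, and a task-private module (all names illustrative).
import torch
import torch.nn as nn


class CustomizedDialogueModel(nn.Module):
    def __init__(self, hidden_size=256, vocab_size=30000):
        super().__init__()
        # Shared module: parameters reused by every task.
        self.shared = nn.GRU(hidden_size, hidden_size, batch_first=True)
        # Gating module: also shared; decides how much of the private
        # module's output to mix in for each position.
        self.gate = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.Sigmoid())
        # Private module: one copy per task; its structure is later
        # specialized to fit that task's characteristics.
        self.private = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.Tanh(),
            nn.Linear(hidden_size, hidden_size),
        )
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)                  # (batch, seq, hidden)
        shared_h, _ = self.shared(x)            # shared representation
        private_h = self.private(shared_h)      # task-specific representation
        g = self.gate(shared_h)                 # element-wise mixing weights
        h = g * private_h + (1 - g) * shared_h  # fuse the two views
        return self.out(h)                      # logits over the vocabulary


# Usage: one model instance per task; only `private` differs across tasks.
model = CustomizedDialogueModel()
logits = model(torch.randint(0, 30000, (2, 10)))  # (2, 10, 30000)
```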

Highlights

  • Generative dialogue models often require a large amount of dialogue data for training, and it is challenging to build models that can adapt to new domains or tasks with limited data

  • We propose the Customized Model Agnostic Meta-Learning (CMAML) algorithm, which can customize unique dialogue models for different tasks

  • CMAML introduces a private network for each task’s dialogue model, whose structure will evolve during training to better fit the characteristics of this task (see the sketch after this list)
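
One way such task-specific structures can emerge is to fine-tune the private module on a task's few examples and then keep only the connections it actually uses. The magnitude-based pruning below is an illustrative assumption, not necessarily the authors' exact procedure, and the threshold is a placeholder.

```python
# Illustrative sketch: customize a private module's structure per task by
# pruning low-magnitude weights after fine-tuning, leaving a task-specific
# sub-network. Threshold and in-place masking are assumptions.
import torch
import torch.nn as nn


def customize_private_structure(private: nn.Module, threshold: float = 1e-2):
    """Zero out (prune) weights whose magnitude is below `threshold`,
    so each task keeps only the connections it relies on."""
    masks = {}
    with torch.no_grad():
        for name, param in private.named_parameters():
            if param.dim() < 2:          # skip biases
                continue
            mask = (param.abs() >= threshold).float()
            param.mul_(mask)             # remove weak connections
            masks[name] = mask           # reuse the mask to keep them pruned
    return masks


# Example on a toy private module.
private = nn.Sequential(nn.Linear(8, 8), nn.Tanh(), nn.Linear(8, 8))
masks = customize_private_structure(private)
print({k: int(v.sum().item()) for k, v in masks.items()})  # surviving edges per layer
```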


Summary

Introduction

Generative dialogue models often require a large amount of dialogue data for training, and it is challenging to build models that can adapt to new domains or tasks with limited data. While pre-training is beneficial, such models still require sufficient task-specific data for fine-tuning and cannot achieve satisfactory performance when only a few examples are available. Take building personalized dialogue models as an example: previous work treats learning dialogues with different personas as different tasks [Madotto et al., 2019; Qian and Yu, 2019]. It employs MAML to find an initialization of the model parameters by maximizing the sensitivity of the loss function when applied to new tasks; each task's dialogue model is then obtained by fine-tuning this initialization on the task's own training samples.
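
A minimal sketch of this meta-learning recipe follows. It uses a first-order, Reptile-style approximation of MAML for brevity; the learning rates, step counts, and the `tasks` / `loss_fn` / `support_set` interfaces are placeholders, not the authors' implementation.

```python
# Sketch: meta-train an initialization on non-target tasks, then fine-tune
# it on a target task's few samples (first-order / Reptile-style update).
import copy
import torch


def meta_train(model, tasks, loss_fn, meta_lr=1e-3, inner_lr=1e-2, inner_steps=3):
    """Find an initialization that adapts quickly to each task's few dialogues."""
    for task in tasks:                               # non-target tasks
        adapted = copy.deepcopy(model)
        opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                 # inner loop: adapt to one task
            for batch in task.support_set():
                opt.zero_grad()
                loss_fn(adapted, batch).backward()
                opt.step()
        # Outer update: move the initialization toward the adapted parameters.
        with torch.no_grad():
            for p, q in zip(model.parameters(), adapted.parameters()):
                p += meta_lr * (q - p)
    return model


def fine_tune(init_model, target_task, loss_fn, lr=1e-2, steps=10):
    """Adapt the meta-learned initialization to the target task's few samples."""
    model = copy.deepcopy(init_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        for batch in target_task.support_set():
            opt.zero_grad()
            loss_fn(model, batch).backward()
            opt.step()
    return model
```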


