Abstract

Training generative models with a minimal corpus is one of the critical challenges for building open-domain dialogue systems. Existing methods tend to use the meta-learning framework, which pre-trains the parameters on all non-target tasks and then fine-tunes them on the target task. However, fine-tuning distinguishes tasks from the parameter perspective but ignores the model-structure perspective, resulting in similar dialogue models for different tasks. In this paper, we propose an algorithm that can customize a unique dialogue model for each task in the few-shot setting. In our approach, each dialogue model consists of a shared module, a gating module, and a private module. The first two modules are shared among all tasks, while the third differentiates into different network structures to better capture the characteristics of the corresponding task. Extensive experiments on two datasets show that our method outperforms all baselines in terms of task consistency, response quality, and diversity.
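
The sketch below is not the authors' code; it only illustrates how the three components described above could be wired together. The module types, sizes, and the gated fusion scheme are assumptions made for illustration.

```python
# Minimal sketch of a per-task dialogue model built from a shared module,
# a shared gating module, and a task-private module (all names illustrative).
import torch
import torch.nn as nn


class CustomizedDialogueModel(nn.Module):
    def __init__(self, hidden_size=256, vocab_size=30000):
        super().__init__()
        # Shared module: parameters reused by every task.
        self.shared = nn.GRU(hidden_size, hidden_size, batch_first=True)
        # Gating module: also shared; decides how much of the private
        # module's output to mix in for each position.
        self.gate = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.Sigmoid())
        # Private module: one copy per task; its structure is later
        # specialized to fit that task's characteristics.
        self.private = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.Tanh(),
            nn.Linear(hidden_size, hidden_size),
        )
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)                  # (batch, seq, hidden)
        shared_h, _ = self.shared(x)            # shared representation
        private_h = self.private(shared_h)      # task-specific representation
        g = self.gate(shared_h)                 # element-wise mixing weights
        h = g * private_h + (1 - g) * shared_h  # fuse the two views
        return self.out(h)                      # logits over the vocabulary


# Usage: one model instance per task; only `private` differs across tasks.
model = CustomizedDialogueModel()
logits = model(torch.randint(0, 30000, (2, 10)))  # (2, 10, 30000)
```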

Highlights

  • Generative dialogue models often require a large amount of dialogue data for training, and it is challenging to build models that can adapt to new domains or tasks with limited data

  • We propose the Customized Model Agnostic Meta-Learning (CMAML) algorithm, which can customize unique dialogue models for different tasks

  • CMAML introduces a private network for each task’s dialogue model, whose structure will evolve during training to better fit the characteristics of this task (see the sketch after this list)
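
One way such task-specific structures can emerge is to fine-tune the private module on a task's few examples and then keep only the connections it actually uses. The magnitude-based pruning below is an illustrative assumption, not necessarily the authors' exact procedure, and the threshold is a placeholder.

```python
# Illustrative sketch: customize a private module's structure per task by
# pruning low-magnitude weights after fine-tuning, leaving a task-specific
# sub-network. Threshold and in-place masking are assumptions.
import torch
import torch.nn as nn


def customize_private_structure(private: nn.Module, threshold: float = 1e-2):
    """Zero out (prune) weights whose magnitude is below `threshold`,
    so each task keeps only the connections it relies on."""
    masks = {}
    with torch.no_grad():
        for name, param in private.named_parameters():
            if param.dim() < 2:          # skip biases
                continue
            mask = (param.abs() >= threshold).float()
            param.mul_(mask)             # remove weak connections
            masks[name] = mask           # reuse the mask to keep them pruned
    return masks


# Example on a toy private module.
private = nn.Sequential(nn.Linear(8, 8), nn.Tanh(), nn.Linear(8, 8))
masks = customize_private_structure(private)
print({k: int(v.sum().item()) for k, v in masks.items()})  # surviving edges per layer
```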


Summary

Introduction

Generative dialogue models often require a large amount of dialogue data for training, and it is challenging to build models that can adapt to new domains or tasks with limited data. While pre-training is beneficial, such models still require sufficient task-specific data for fine-tuning and cannot achieve satisfactory performance when only a few examples are available. Take building personalized dialogue models as an example: previous work treats learning dialogues with different personas as different tasks [Madotto et al., 2019; Qian and Yu, 2019]. It employs MAML to find an initialization of the model parameters by maximizing the sensitivity of the loss function when applied to new tasks; each task's dialogue model is then obtained by fine-tuning this initialization on the task's own training samples.
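
A minimal sketch of this meta-learning recipe follows. It uses a first-order, Reptile-style approximation of MAML for brevity; the learning rates, step counts, and the `tasks` / `loss_fn` / `support_set` interfaces are placeholders, not the authors' implementation.

```python
# Sketch: meta-train an initialization on non-target tasks, then fine-tune
# it on a target task's few samples (first-order / Reptile-style update).
import copy
import torch


def meta_train(model, tasks, loss_fn, meta_lr=1e-3, inner_lr=1e-2, inner_steps=3):
    """Find an initialization that adapts quickly to each task's few dialogues."""
    for task in tasks:                               # non-target tasks
        adapted = copy.deepcopy(model)
        opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                 # inner loop: adapt to one task
            for batch in task.support_set():
                opt.zero_grad()
                loss_fn(adapted, batch).backward()
                opt.step()
        # Outer update: move the initialization toward the adapted parameters.
        with torch.no_grad():
            for p, q in zip(model.parameters(), adapted.parameters()):
                p += meta_lr * (q - p)
    return model


def fine_tune(init_model, target_task, loss_fn, lr=1e-2, steps=10):
    """Adapt the meta-learned initialization to the target task's few samples."""
    model = copy.deepcopy(init_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        for batch in target_task.support_set():
            opt.zero_grad()
            loss_fn(model, batch).backward()
            opt.step()
    return model
```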


