Abstract

In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. MinTL is a simple yet effective transfer learning framework, which allows us to plug-and-play pre-trained seq2seq models and jointly learn dialogue state tracking and dialogue response generation. Unlike previous approaches, which use a copy mechanism to “carry over” the old dialogue states to the new one, we introduce Levenshtein belief spans (Lev), which allow efficient dialogue state tracking with a minimal generation length. We instantiate our learning framework with two pre-trained backbones, T5 and BART, and evaluate them on MultiWOZ. Extensive experiments demonstrate that: 1) our systems establish new state-of-the-art results on end-to-end response generation; 2) MinTL-based systems are more robust than baseline methods in the low-resource setting, achieving competitive results with only 20% of the training data; and 3) Lev greatly improves inference efficiency.
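
The Levenshtein belief span idea can be pictured as generating only the slots that changed at the current turn and applying them as edits to the previous state, instead of regenerating the full state every turn. Below is a minimal Python sketch of that update step; the (domain, slot, value) triple format and the "NULL" deletion marker are simplifying assumptions for illustration, and the exact Lev span format is the one defined in the paper.

    # Minimal sketch of applying a Levenshtein-style belief delta to the previous state.
    # The (domain, slot, value) triple format and the "NULL" deletion marker are
    # simplifying assumptions; the exact Lev span format is defined in the paper.

    def apply_lev(prev_state: dict, lev_triples: list) -> dict:
        """Return the new dialogue state after applying the generated delta."""
        new_state = dict(prev_state)            # unchanged slots are carried over for free
        for domain, slot, value in lev_triples:
            if value == "NULL":                 # hypothetical marker: the slot was removed
                new_state.pop((domain, slot), None)
            else:                               # slot was added or its value updated
                new_state[(domain, slot)] = value
        return new_state

    # Only the changed slot needs to be decoded, keeping the generation length minimal.
    prev = {("hotel", "area"): "north", ("hotel", "stars"): "4"}
    lev = [("hotel", "area", "centre")]         # the user changed only the hotel area
    print(apply_lev(prev, lev))
    # {('hotel', 'area'): 'centre', ('hotel', 'stars'): '4'}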

Highlights

  • Building robust task-oriented dialogue systems is challenging due to complex system design and the limited availability of human-annotated data (Wen et al., 2017; Wu et al., 2019b)

  • Fine-tuning pre-trained language models improves a wide range of natural language processing applications (Lewis et al., 2019; Raffel et al., 2019), notably machine translation (Conneau and Lample, 2019) and personalized dialogue response generation (Wolf et al., 2019b)

  • We propose Minimalist Transfer Learning (MinTL), a simple yet effective transfer learning framework that allows us to plug-and-play pre-trained sequence-to-sequence (Seq2Seq) models and jointly learn dialogue state tracking (DST) and dialogue response generation; a simplified sketch of this joint setup is shown after this list
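
To make the plug-and-play point concrete, the following minimal sketch fine-tunes a single pre-trained Seq2Seq backbone on both sub-tasks by summing their losses. It assumes the HuggingFace transformers library, the t5-small checkpoint, and made-up flattened text encodings for the dialogue history, belief state, and Lev delta; the paper's actual input format and decoder configuration may differ.

    # Minimal sketch of the plug-and-play idea, assuming the HuggingFace `transformers`
    # library, the t5-small checkpoint, and illustrative flattened text formats for the
    # dialogue history, belief state, and Lev delta; the paper's exact input encoding
    # and decoder setup may differ.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    history = "user: i need a hotel in the centre"
    prev_state = "hotel area north"        # flattened previous belief state (illustrative)

    # 1) Dialogue state tracking: learn to generate only the Lev delta, not the full state.
    dst_in = tokenizer(f"track: {history} state: {prev_state}", return_tensors="pt")
    dst_target = tokenizer("hotel area centre", return_tensors="pt")
    dst_loss = model(input_ids=dst_in.input_ids, labels=dst_target.input_ids).loss

    # 2) Response generation: condition on the history and the updated state.
    rg_in = tokenizer(f"respond: {history} state: hotel area centre", return_tensors="pt")
    rg_target = tokenizer("there are several hotels in the centre .", return_tensors="pt")
    rg_loss = model(input_ids=rg_in.input_ids, labels=rg_target.input_ids).loss

    # Joint learning: sum the two losses and back-propagate through the shared backbone.
    (dst_loss + rg_loss).backward()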

Introduction

Building robust task-oriented dialogue systems is challenging due to complex system design and the limited availability of human-annotated data (Wen et al., 2017; Wu et al., 2019b). Recent progress in pre-training language models has shown promise in alleviating the data scarcity problem (Budzianowski and Vulic, 2019; Wu et al., 2020). Current state-of-the-art (SOTA) approaches in task-oriented dialogue rely on several task-specific modules, such as a State Operation Predictor (Kim et al., 2019) for dialogue state tracking and CopyNet (Gu et al., 2016) for end-to-end dialogue task completion (Lei et al., 2018; Zhang et al., 2019b). Such modules are usually absent in the pre-training stage, so task-specific architecture modifications are required to adapt pre-trained language models to different dialogue tasks.
