Abstract

Intent classification (IC) and slot filling (SF) are core components of most goal-oriented dialogue systems. Current IC/SF models perform poorly when the number of training examples per class is small. We propose a new few-shot learning task, few-shot IC/SF, to study and improve the performance of IC and SF models on classes not seen at training time in ultra-low-resource scenarios. We establish a few-shot IC/SF benchmark by defining few-shot splits for three public IC/SF datasets: ATIS, TOP, and Snips. We show that two popular few-shot learning algorithms, model-agnostic meta-learning (MAML) and prototypical networks, outperform a fine-tuning baseline on this benchmark. Prototypical networks achieves significant gains in IC performance on the ATIS and TOP datasets, while both prototypical networks and MAML outperform the baseline with respect to SF on all three datasets. In addition, we demonstrate that both joint training and the use of pre-trained language models (ELMo and BERT in our case) are complementary to these few-shot learning methods and yield further gains.
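As a point of reference for the methods named above, the core of prototypical networks is a nearest-class-mean rule over embedded examples. The PyTorch sketch below is an illustrative reading of that rule, not the authors' implementation; the encoder producing `support_emb` and `query_emb`, and all variable names, are assumptions.

```python
import torch

def prototypical_logits(support_emb, support_labels, query_emb, n_classes):
    """Score query utterances against class prototypes (illustrative sketch).

    support_emb:    [n_support, dim] encoder outputs for the labeled support set
    support_labels: [n_support]      intent indices in [0, n_classes)
    query_emb:      [n_query, dim]   encoder outputs for the query set
    Assumes every class appears at least once in the support set.
    """
    # A prototype is the mean embedding of a class's support examples.
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(n_classes)
    ])                                                  # [n_classes, dim]
    # Logits are negative squared Euclidean distances to the prototypes.
    return -torch.cdist(query_emb, prototypes).pow(2)   # [n_query, n_classes]
```

Because a prototype is just a class mean, an intent class unseen during training needs only a handful of labeled support utterances at test time to participate in classification, which is what makes the method a natural fit for few-shot IC.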

Highlights

  • We propose few-shot IC/SF, a new few-shot learning task, and establish a benchmark by defining few-shot splits for three public IC/SF datasets: ATIS, TOP, and Snips

  • Our best few-shot approach outperforms the strongest fine-tuning baseline (ELMo + joint training) by up to 6% IC accuracy and 43 slot F1 points for Kmax = 20, and by 14% IC accuracy and 45 slot F1 points for Kmax = 100

  • We show that few-shot learning techniques can substantially improve IC/SF performance in ultra-low-resource scenarios (one such technique, MAML, is sketched below this list)
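One such technique, model-agnostic meta-learning (MAML), learns a model initialization that adapts to a new task from a few examples. Below is a minimal first-order sketch of a single MAML episode in PyTorch (>= 2.0 for `torch.func.functional_call`); the function name, the `support`/`query` batches, and `inner_lr` are illustrative assumptions, not the paper's implementation.

```python
import torch
from torch.func import functional_call

def maml_episode_loss(model, loss_fn, support, query, inner_lr=0.01):
    """Loss for one episode: adapt on the support set, evaluate on the query set.

    support / query are (inputs, labels) batches for a single few-shot task.
    First-order variant: gradients are not propagated through the inner update.
    """
    params = dict(model.named_parameters())
    # Inner loop: one SGD step on the support set.
    x_s, y_s = support
    inner_loss = loss_fn(functional_call(model, params, (x_s,)), y_s)
    grads = torch.autograd.grad(inner_loss, list(params.values()))
    adapted = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}
    # Outer objective: adapted parameters evaluated on the query set.
    x_q, y_q = query
    return loss_fn(functional_call(model, adapted, (x_q,)), y_q)
```

Meta-training averages this episode loss over sampled tasks and backpropagates into the shared initialization; at test time, the same inner loop adapts the model to a new intent using only the handful of support examples available.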

Introduction

In the context of goal-oriented dialogue systems, intent classification (IC) is the process of classifying a user's utterance into an intent, such as BookFlight or AddToPlaylist, that refers to the user's goal. Slot filling (SF), in turn, identifies and labels the tokens of the utterance that carry the parameters of that goal, for example the departure and arrival cities of a flight. Most state-of-the-art IC/SF models are based on feed-forward, convolutional, or recurrent neural networks (Hakkani-Tür et al., 2016; Goo et al., 2018; Gupta et al., 2019). These neural models offer substantial gains in performance, but they often require a large number of labeled examples (on the order of hundreds) per intent class and slot label to achieve them. The relative scarcity of large-scale datasets annotated with intents and slots prohibits the use of neural IC/SF models in many promising domains, such as medical consultation, where it is difficult to obtain large quantities of annotated dialogues.
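To make the two tasks concrete, a single training example pairs one intent label for the whole utterance with per-token slot tags. The snippet below is a hypothetical example in the style of ATIS; the specific utterance and slot names are illustrative:

```python
# Hypothetical IC/SF training example (ATIS-style label names, for illustration).
example = {
    "utterance": ["book", "a", "flight", "from", "boston", "to", "denver"],
    "intent": "BookFlight",                # one intent label per utterance (IC)
    "slots": ["O", "O", "O", "O",          # one BIO tag per token (SF)
              "B-fromloc.city_name", "O", "B-toloc.city_name"],
}
```

An IC model predicts the single intent label, while an SF model tags every token; the few-shot setting studied here asks both to do so for intent classes and slot labels seen only a handful of times.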
