Abstract

Following recent proposals to handle natural language generation in spoken dialogue systems with long short-term memory (LSTM) recurrent neural network models (Wen et al., 2016), we first investigate a variant of these models with the objective of better integrating the attention subnetwork. We then propose and evaluate a framework for adapting the NLG module online, through direct interactions with the users. In this setting, the basic approach is to ask the user to utter an alternative sentence expressing a particular dialogue act; the system must then decide whether to use an automatic transcription of that sentence or to ask the user for a manual transcription. For this decision, a reinforcement learning approach based on an adversarial bandit scheme is retained. We show that by appropriately defining the rewards as a linear combination of the expected payoffs and the costs of acquiring the new data provided by the user, the system design can balance improving the system's performance towards a better match with the user's preferences against the burden associated with this acquisition. Finally, the actual benefits of the system are assessed with a human evaluation, showing that the addition of more diverse utterances allows the system to produce sentences that are more satisfying for the user.
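The choice between using an automatic transcription and asking for a manual one can be framed as a two-armed adversarial bandit. Below is a minimal sketch, assuming an EXP3-style algorithm and a hypothetical reward that linearly combines a quality payoff with an acquisition cost; the class and function names, the trade-off weight alpha, and the exploration rate gamma are illustrative assumptions, not the paper's actual settings.

```python
import math
import random

class Exp3:
    """EXP3-style adversarial bandit (hypothetical sketch).

    Arms: 0 = use the automatic (ASR) transcription,
          1 = ask the user for a manual transcription.
    """

    def __init__(self, n_arms, gamma=0.1):
        self.n = n_arms
        self.gamma = gamma            # exploration rate
        self.weights = [1.0] * n_arms

    def _probs(self):
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n
                for w in self.weights]

    def select(self):
        probs = self._probs()
        arm = random.choices(range(self.n), weights=probs)[0]
        return arm, probs

    def update(self, arm, reward, probs):
        # Importance-weighted reward estimate; reward assumed in [0, 1].
        x_hat = reward / probs[arm]
        self.weights[arm] *= math.exp(self.gamma * x_hat / self.n)

def reward(payoff, cost, alpha=0.7):
    # Linear combination of the expected payoff and the (negated) cost of
    # acquiring the user's data, clipped to [0, 1]; alpha is a hypothetical
    # trade-off weight between performance gain and user burden.
    return max(0.0, min(1.0, alpha * payoff - (1 - alpha) * cost))
```

At each adaptation turn the system would draw an arm, observe the payoff of the resulting utterance together with the cost incurred by the user, and update the weights with the importance-weighted reward estimate.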

Highlights

  • In a spoken dialogue system, the Natural Language Generation (NLG) component aims to produce an utterance from a system Dialogue Act (DA) decided by the dialogue manager

  • As in the attention-based encoder-decoder, the decoding process is performed through a standard Long Short-Term Memory (LSTM) network (Fig. 1b), which is fed with an additional vector a_t representing the information on which the model currently focuses (Fig. 1a). a_t is called the local DA embedding with attention (see the sketch after this list)

  • In this paper we have investigated an attention-based neural network for natural language generation, combining two systems proposed by Wen et al.: the Semantically Conditioned LSTM-based model (SCLSTM) and the Recurrent Neural Network (RNN) encoder-decoder architecture with an attention mechanism
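To make the decoding step described in the second highlight concrete, here is a minimal sketch of an attention-conditioned LSTM decoder in the spirit of the described architecture, written with PyTorch. The class name, layer sizes, and the single-layer attention scoring function are assumptions for illustration, not the paper's exact model.

```python
import torch
import torch.nn as nn

class AttentiveDADecoder(nn.Module):
    """Hypothetical sketch: an LSTM decoder that, at each step, attends over
    the embeddings of the DA slot-value pairs to build the context vector
    a_t (the local DA embedding with attention) fed to the LSTM cell."""

    def __init__(self, vocab_size, emb_dim, hid_dim, da_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.attn = nn.Linear(hid_dim + da_dim, 1)   # attention scorer
        self.lstm = nn.LSTMCell(emb_dim + da_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def step(self, word_ids, da_embs, h, c):
        # da_embs: (batch, n_slots, da_dim), one embedding per slot-value pair.
        h_rep = h.unsqueeze(1).expand(-1, da_embs.size(1), -1)
        scores = self.attn(torch.cat([h_rep, da_embs], -1)).squeeze(-1)
        alpha = torch.softmax(scores, dim=-1)              # attention weights
        a_t = (alpha.unsqueeze(-1) * da_embs).sum(1)       # context vector a_t
        h, c = self.lstm(torch.cat([self.embed(word_ids), a_t], -1), (h, c))
        return self.out(h), h, c                           # next-word logits
```

At each step the hidden state attends over the DA slot-value embeddings, and the resulting vector a_t is concatenated with the current word embedding before entering the LSTM cell.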


Summary

Introduction

In a spoken dialogue system, the Natural Language Generation (NLG) component aims to produce an utterance from a system Dialogue Act (DA) decided by the dialogue manager. This work is in line with previous studies showing that the transfer between texts and DAs can be directly handled by a general language translation approach (Jabaian et al., 2016) or by inverted semantic parsers (Konstas and Lapata, 2013). In all these cases, a difficulty remains: a huge amount of data is required. It should be noted that at this critical step of development, users should still be under the control of the designers (they can be the designers themselves or colleagues), as it can be hazardous to let the general public directly access such a functionality without any efficient means to counterbalance the effects of the on-line adaptation. This difficult and sensitive point will be addressed more thoroughly in future work.

Related work
A Combined-Context LSTM for language generation
Model description
Comparison with the reference models
Training and decoding
On-line interactive problem
Static case
Adversarial bandit case
Experimental study
System comparison
On-line adaptation evaluation
Human evaluation
On-line adaptation using real ASR data evaluation
Findings
Conclusion