Abstract

We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer). Trained on 147M conversation-like exchanges extracted from Reddit comment chains spanning 2005 through 2017, DialoGPT extends the Hugging Face PyTorch transformer to attain performance close to that of humans in both automatic and human evaluation in single-turn dialogue settings. We show that conversational systems that leverage DialoGPT generate more relevant, contentful and context-consistent responses than strong baseline systems. The pre-trained model and training pipeline are publicly released to facilitate research into neural response generation and the development of more intelligent open-domain dialogue systems.

Highlights

  • We introduce DIALOGPT, a tunable gigaword-scale neural network model for generation of conversational responses, trained on Reddit data

  • We trained our DIALOGPT model on the basis of the GPT-2 (Radford et al., 2018) architecture. The GPT-2 transformer model adopts the generic transformer language model (Vaswani et al., 2017) and leverages a stack of masked multi-head self-attention layers to train on massive web-text data

  • The validation reward can be stably improved, but unlike training under RNN architectures, we observed that reinforcement learning (RL) training converges to a degenerate locally-optimal solution, in which the hypothesis simply repeats the source sentence and mutual information is trivially maximized
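The mutual information maximization (MMI) idea referenced above re-ranks sampled hypotheses using a backward model that scores P(source | hypothesis), penalizing bland responses that fit almost any context. A minimal sketch of that re-ranking step, where `backward_logprob` and the toy word-overlap scorer are hypothetical stand-ins for a trained backward model:

```python
import math

def mmi_rerank(source, hypotheses, backward_logprob):
    """Pick the hypothesis maximizing log P(source | hypothesis).

    `backward_logprob` stands in for a backward model scoring how
    well the response predicts its own context; generic responses
    score poorly because they fit almost any source.
    """
    return max(hypotheses, key=lambda h: backward_logprob(source, h))

# Toy illustration: score by word overlap with the source
# (a crude hypothetical proxy for a trained backward model).
def toy_backward_logprob(source, hypothesis):
    src, hyp = set(source.split()), set(hypothesis.split())
    return math.log(len(src & hyp) + 1)

best = mmi_rerank(
    "what time does the match start",
    ["i do not know", "the match starts at nine"],
    toy_backward_logprob,
)
```

The degenerate RL solution described in the bullet corresponds to the hypothesis that copies the source verbatim, which trivially maximizes any such backward score; re-ranking a fixed candidate pool avoids that failure mode.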


Summary

Introduction

We introduce DIALOGPT, a tunable gigaword-scale neural network model for generation of conversational responses, trained on Reddit data. OpenAI's GPT-2 (Radford et al., 2018), for example, has demonstrated that transformer models trained on very large datasets can capture long-term dependencies in textual data and generate text that is fluent, lexically diverse, and rich in content. Most open-domain neural response generation systems suffer from content or style inconsistency (Li et al., 2016b; Zhang et al., 2019; Gao et al., 2019c), lack of long-term contextual information (Serban et al., 2017), and blandness (Li et al., 2016a; Zhang et al., 2018; Qin et al., 2019). While these issues can be alleviated by modelling strategies designed to boost information content, a transformer-based architecture like GPT-2 (Radford et al., 2018), which uses a multi-layer self-attentive mechanism to allow fully-connected cross-attention to the full context in a computationally efficient manner, seems like a natural choice for exploring a more general solution. The DIALOGPT package contains an open-source training pipeline (data extraction/preparation and model training/evaluation) built upon the Huggingface PyTorch transformer (HuggingFace, 2019).
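The masked (causal) self-attention at the core of this architecture can be sketched minimally in NumPy. This is an illustrative single-head version with random projections, not DIALOGPT's implementation; the key point is the upper-triangular mask that lets each position attend only to its left context:

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head masked (causal) self-attention.

    x: (seq_len, d_model) token representations
    w_q, w_k, w_v: (d_model, d_k) projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (seq_len, seq_len)
    # Causal mask: position i may attend only to positions <= i,
    # so generation conditions on the left context alone.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal((5, d))
w_q = rng.standard_normal((d, d))
w_k = rng.standard_normal((d, d))
w_v = rng.standard_normal((d, d))
out = causal_self_attention(x, w_q, w_k, w_v)
```

Because of the mask, perturbing a later token cannot change the outputs at earlier positions, which is what makes autoregressive left-to-right generation consistent with training.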

Dataset
Model Architecture
Mutual Information Maximization
Experimental Details
DSTC-7 Dialogue Generation Challenge
A New Reddit Multi-reference Dataset
Re-ranking The Response Using MMI
Generation Examples
Human Evaluation
Related work
Limitations and risks
Conclusion
A Additional Details of Human Evaluation