Abstract

Medical dialogue systems are promising in assisting in telemedicine to increase access to healthcare services, improve the quality of patient care, and reduce medical costs. To facilitate the research and development of medical dialogue systems, we build large-scale medical dialogue datasets -- MedDialog, which contain 1) a Chinese dataset with 3.4 million conversations between patients and doctors, 11.3 million utterances, 660.2 million tokens, covering 172 specialties of diseases, and 2) an English dataset with 0.26 million conversations, 0.51 million utterances, 44.53 million tokens, covering 96 specialties of diseases. To our best knowledge, MedDialog is the largest medical dialogue dataset to date. We pretrain several dialogue generation models on the Chinese MedDialog dataset, including Transformer, GPT, BERT-GPT, and compare their performance. It is shown that models trained on MedDialog are able to generate clinically correct and doctor-like medical dialogues. We also study the transferability of models trained on MedDialog to low-resource medical dialogue generation tasks. It is shown that via transfer learning which finetunes the models pretrained on MedDialog, the performance on medical dialogue generation tasks with small datasets can be greatly improved, as shown in human evaluation and automatic evaluation. The datasets and code are available at https://github.com/UCSD-AI4H/Medical-Dialogue-System

Highlights

  • Telemedicine refers to the practice of delivering patient care remotely, where doctors provide medical consultations to patients using HIPAA compliant video-conferencing tools

  • Through human evaluation and automatic evaluation, we show that the pretrained models on MedDialog-CN can significantly improve performance on medical dialogue generation tasks where the dataset size is small, via transfer learning

  • Given a dialogue containing a sequence of alternating utterances between patient and doctor, we process it into a set of pairs {(si, ti)} where the target ti is a response from the doctor and the source si is the concatenation of all utterances before ti

Read more

Summary

Introduction

Telemedicine refers to the practice of delivering patient care remotely, where doctors provide medical consultations to patients using HIPAA compliant video-conferencing tools. To address the limitations of existing datasets, we build large-scale medical dialogue datasets – MedDialog – that contain 1) a Chinese dataset with 3.4 million conversations between patients and doctors, 11.3 million utterances, 660.2 million tokens, covering 172 specialties of diseases, and 2) an English dataset with 0.26 million conversations, 0.51 million utterances, 44.53 million tokens, covering 96 specialties of diseases. We pretrain several dialogue generation models on the Chinese MedDialog dataset, including Transformer, BERT-GPT, and GPT, and compare their performance using automatic metrics. Through human evaluation and automatic evaluation, we show that the pretrained models on MedDialog-CN can significantly improve performance on medical dialogue generation tasks where the dataset size is small, via transfer learning. # dialogues # utterances # tokens Avg. # of utterances in a dialogue Max # of utterances in a dialogue Min # of utterances in a dialogue Avg. # of tokens in an utterance Max # of tokens in an utterance Min # tokens in an utterance

Related Works
The Chinese MedDialog dataset
The English MedDialog dataset
Advantages of our datasets
Methods
Dialogue Generation as Sequence-to-Sequence Modeling
Dialogue Generation as Language Modeling
Pretraining
Experimental Settings
Results
Transfer to Other Datasets
Experimental settings

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.