Abstract
Medical dialogue systems show promise for assisting telemedicine: they can increase access to healthcare services, improve the quality of patient care, and reduce medical costs. To facilitate the research and development of medical dialogue systems, we build large-scale medical dialogue datasets -- MedDialog -- which contain 1) a Chinese dataset with 3.4 million conversations between patients and doctors, 11.3 million utterances, and 660.2 million tokens, covering 172 specialties of diseases, and 2) an English dataset with 0.26 million conversations, 0.51 million utterances, and 44.53 million tokens, covering 96 specialties of diseases. To the best of our knowledge, MedDialog is the largest medical dialogue dataset to date. We pretrain several dialogue generation models on the Chinese MedDialog dataset, including Transformer, GPT, and BERT-GPT, and compare their performance. Models trained on MedDialog are able to generate clinically correct and doctor-like medical dialogues. We also study the transferability of models trained on MedDialog to low-resource medical dialogue generation tasks. Both human evaluation and automatic evaluation show that finetuning models pretrained on MedDialog greatly improves performance on medical dialogue generation tasks with small datasets. The datasets and code are available at https://github.com/UCSD-AI4H/Medical-Dialogue-System
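The abstract describes a transfer-learning recipe: pretrain a dialogue model on MedDialog, then finetune it on a small target dataset. The following is a minimal sketch of that recipe, not the authors' released training code; it assumes a Hugging Face GPT-2 checkpoint as a stand-in for a MedDialog-pretrained model, with illustrative hyperparameters and toy data.

```python
# Hypothetical sketch of finetuning a pretrained dialogue model on a small
# target dataset. "gpt2" is an English stand-in; the paper's models are
# pretrained on the Chinese MedDialog corpus.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

def encode(pairs, max_len=128):
    """Join each (source, target) pair into one training sequence."""
    texts = [s + tokenizer.eos_token + t + tokenizer.eos_token for s, t in pairs]
    return tokenizer(texts, truncation=True, max_length=max_len,
                     padding="max_length", return_tensors="pt")

# Toy low-resource dataset; a real one would hold a few hundred dialogues.
small_dataset = [
    ("Patient: I have had a sore throat for a week.",
     "Doctor: Any fever or difficulty swallowing?"),
]
batch = encode(small_dataset)
labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # ignore padding in the LM loss

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(3):  # a few epochs are typical for small finetuning sets
    loss = model(input_ids=batch["input_ids"],
                 attention_mask=batch["attention_mask"],
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```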
Highlights
Telemedicine refers to the practice of delivering patient care remotely, where doctors provide medical consultations to patients using HIPAA-compliant video-conferencing tools
Through human evaluation and automatic evaluation, we show that models pretrained on MedDialog-CN can, via transfer learning, significantly improve performance on medical dialogue generation tasks where the dataset is small
Given a dialogue containing a sequence of alternating utterances between patient and doctor, we process it into a set of pairs $\{(s_i, t_i)\}$, where the target $t_i$ is a response from the doctor and the source $s_i$ is the concatenation of all utterances before $t_i$
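As a concrete illustration of this preprocessing step, here is a minimal Python sketch; the list-of-(speaker, utterance) input format and the `dialogue_to_pairs` helper are illustrative assumptions, not the dataset's actual schema.

```python
def dialogue_to_pairs(turns):
    """Convert a dialogue, given as (speaker, utterance) turns, into
    (source, target) training pairs: each doctor utterance t_i becomes a
    target, and the concatenation of all preceding utterances is its
    source s_i."""
    pairs = []
    for i, (speaker, utterance) in enumerate(turns):
        if speaker == "doctor" and i > 0:
            source = " ".join(u for _, u in turns[:i])
            pairs.append((source, utterance))
    return pairs

turns = [
    ("patient", "I have had a dry cough for three days."),
    ("doctor", "Do you have a fever?"),
    ("patient", "Yes, 38.2 degrees since last night."),
    ("doctor", "Please get a blood test and a chest X-ray."),
]
# Produces two (s_i, t_i) pairs, one per doctor response.
print(dialogue_to_pairs(turns))
```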
Summary
Telemedicine refers to the practice of delivering patient care remotely, where doctors provide medical consultations to patients using HIPAA-compliant video-conferencing tools. To address the limitations of existing datasets, we build large-scale medical dialogue datasets -- MedDialog -- that contain 1) a Chinese dataset with 3.4 million conversations between patients and doctors, 11.3 million utterances, and 660.2 million tokens, covering 172 specialties of diseases, and 2) an English dataset with 0.26 million conversations, 0.51 million utterances, and 44.53 million tokens, covering 96 specialties of diseases. We pretrain several dialogue generation models on the Chinese MedDialog dataset, including Transformer, BERT-GPT, and GPT, and compare their performance using automatic metrics. Through human evaluation and automatic evaluation, we show that models pretrained on MedDialog-CN can, via transfer learning, significantly improve performance on medical dialogue generation tasks where the dataset is small. The reported dataset statistics include the number of dialogues, utterances, and tokens, as well as the average, maximum, and minimum number of utterances per dialogue and of tokens per utterance.
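The table values themselves are not recoverable from this summary, but the listed statistics are straightforward to compute. Below is a hedged sketch, assuming dialogues are stored as lists of utterance strings and using whitespace tokenization as a simplification of the paper's tokenizer.

```python
from statistics import mean

def dataset_stats(dialogues):
    """Compute the per-dialogue and per-utterance statistics listed above.
    `dialogues` is assumed to be a list of dialogues, each a list of
    utterance strings; whitespace tokenization is an approximation."""
    utterances = [u for d in dialogues for u in d]
    utt_per_dialogue = [len(d) for d in dialogues]
    tokens_per_utt = [len(u.split()) for u in utterances]
    return {
        "# dialogues": len(dialogues),
        "# utterances": len(utterances),
        "# tokens": sum(tokens_per_utt),
        "avg/max/min utterances per dialogue":
            (mean(utt_per_dialogue), max(utt_per_dialogue), min(utt_per_dialogue)),
        "avg/max/min tokens per utterance":
            (mean(tokens_per_utt), max(tokens_per_utt), min(tokens_per_utt)),
    }
```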