Abstract

While deep learning techniques have shown promising results in many natural language processing (NLP) tasks, they have not been widely applied in the clinical domain. The lack of large datasets and the pervasive use of domain-specific language (e.g., abbreviations and acronyms) in the clinical domain cause slower progress on NLP tasks than in the general domain. To fill this gap, we employ word- and subword-level models that adopt large-scale data-driven methods, such as pre-trained language models and transfer learning, to analyze clinical text. Empirical results demonstrate the superiority of the proposed methods, which achieve 90.6% accuracy on a medical-domain natural language inference task. Furthermore, we inspect the independent strengths of the proposed approaches both quantitatively and qualitatively. This analysis will help researchers select the necessary components when building models for the medical domain.
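
A minimal sketch of the transfer-learning setup described above, assuming the Hugging Face transformers library; this is not the authors' released code, and the checkpoint name, example sentence pair, label mapping, and hyperparameters are illustrative assumptions. It frames clinical natural language inference as sentence-pair classification on top of a pre-trained encoder.

```python
# Sketch: transfer learning for clinical NLI with a pre-trained transformer.
# Assumes the Hugging Face `transformers` library; the checkpoint, example
# pair, label mapping, and learning rate are illustrative, not from the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINT = "bert-base-uncased"  # placeholder; a domain-adapted checkpoint (e.g. BioBERT) would be substituted

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=3)

# NLI as sentence-pair classification:
# [CLS] premise [SEP] hypothesis [SEP] -> {entailment, neutral, contradiction}
premise = "The patient denies any chest pain or shortness of breath."
hypothesis = "The patient has no cardiac symptoms."
batch = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

label = torch.tensor([0])  # 0 = entailment (illustrative label mapping)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One fine-tuning step: the pre-trained encoder weights and the randomly
# initialized classification head are updated jointly.
outputs = model(**batch, labels=label)
outputs.loss.backward()
optimizer.step()
```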

Highlights

  • Natural language processing (NLP) has rapidly broadened its applications in recent years, including question answering, neural machine translation, natural language inference, and other language-related tasks

  • We explore three variants of BioBERT that are fine-tuned from the original BERT on the PubMed Central full-text article (PMC), PubMed, and PMC+PubMed datasets

  • We study natural language inference in the clinical domain, where training corpora are insufficient due to the nature of the domain


Introduction

Natural language processing (NLP) has rapidly broadened its applications in recent years, including question answering, neural machine translation, natural language inference, and other language-related tasks. Unlike other areas of NLP, the clinical domain's lack of large labeled datasets and restricted data access have discouraged active participation by NLP researchers (Romanov and Shivade, 2018). The pervasive use of abbreviations and acronyms in clinical text complicates normalization and makes the related tasks more difficult (Pakhomov, 2002). It has been shown that language models pre-trained on large and diverse corpora (e.g., BERT (Devlin et al., 2018) and ELMo (Peters et al., 2018)) generate deep contextualized word representations. These methods have proven highly effective at improving performance across a wide range of NLP tasks by enabling better text understanding, and they have become a crucial component of these tasks since their publication.
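
To make the idea of contextualized representations concrete, the sketch below (an illustration assuming the Hugging Face transformers API, not code from the paper) extracts the hidden state of the same surface word in two different clinical sentences; the two vectors differ because the encoder conditions each token's representation on its context.

```python
# Illustration of contextualized word representations from a pre-trained
# masked language model. The same surface form ("discharge") receives a
# different vector in each sentence. Assumes the Hugging Face `transformers`
# library; the checkpoint and example sentences are illustrative.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(sentence: str, word: str) -> torch.Tensor:
    """Last-layer hidden state of the first subword of `word` in `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return hidden[tokens.index(tokenizer.tokenize(word)[0])]

v1 = embed("The patient was stable at discharge.", "discharge")
v2 = embed("Purulent discharge was noted at the wound site.", "discharge")
# Cosine similarity is below 1.0: the representation depends on context.
print(torch.cosine_similarity(v1, v2, dim=0).item())
```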
