Abstract
While deep learning techniques have shown promising results in many natural language processing (NLP) tasks, they have not been widely applied to the clinical domain. The lack of large datasets and the pervasive use of domain-specific language (i.e. abbreviations and acronyms) in the clinical domain have made progress on clinical NLP tasks slower than on general NLP tasks. To fill this gap, we employ word- and subword-level models that adopt large-scale data-driven methods, such as pre-trained language models and transfer learning, to analyze clinical text. Empirical results demonstrate the superiority of the proposed methods, which achieve 90.6% accuracy on a medical-domain natural language inference task. Furthermore, we inspect the independent strengths of the proposed approaches both quantitatively and qualitatively. This analysis will help researchers select the necessary components when building models for the medical domain.
Highlights
In recent years, natural language processing (NLP) has rapidly broadened its applications to question answering, neural machine translation, natural language inference, and other language-related tasks
We explore three kinds of BioBERT that are fine-tuned from the original BERT with PubMed Central full-text articles (PMC), PubMed, and PMC+PubMed datasets
We study natural language inference in the clinical domain, where training corpora are insufficient owing to the nature of the domain
Summary
In recent years, natural language processing (NLP) has rapidly broadened its applications to question answering, neural machine translation, natural language inference, and other language-related tasks. Unlike in other areas of NLP, the lack of large labeled datasets and the restricted access to data in the clinical domain have discouraged NLP researchers from actively participating in this domain (Romanov and Shivade, 2018). The pervasive use of abbreviations and acronyms in clinical text complicates text normalization and makes the related tasks more difficult (Pakhomov, 2002). Language models pre-trained on large, diverse corpora, such as BERT (Devlin et al., 2018) and ELMo (Peters et al., 2018), have been shown to generate deep contextualized word representations. These methods have proven very effective at improving performance on a wide range of NLP tasks by enabling better text understanding, and they have become a crucial part of such tasks since their publication.