Neural Domain Adaptation for Biomedical Question Answering

Georg Wiese, Mariana Neves, Dirk Weissenborn

doi:10.18653/v1/k17-1029

Publication Date: Jan 1, 2017
Citations: 81
License type: cc-by

Abstract

Factoid question answering (QA) has recently benefited from the development of deep learning (DL) systems. Neural network models outperform traditional approaches in domains where large datasets exist, such as SQuAD (ca. 100,000 questions) for Wikipedia articles. However, these systems have not yet been applied to QA in more specific domains, such as biomedicine, because datasets are generally too small to train a DL system from scratch. For example, the BioASQ dataset for biomedical QA comprises fewer than 900 factoid (single answer) and list (multiple answers) QA instances. In this work, we adapt a neural QA system trained on a large open-domain dataset (SQuAD, source) to a biomedical dataset (BioASQ, target) by employing various transfer learning techniques. Our network architecture is based on a state-of-the-art QA system, extended with biomedical word embeddings and a novel mechanism to answer list questions. In contrast to existing biomedical QA systems, our system does not rely on domain-specific ontologies, parsers or entity taggers, which are expensive to create. Despite this, our systems achieve state-of-the-art results on factoid questions and competitive results on list questions.

Highlights

Question answering (QA) is the task of retrieving answers to a question given one or more contexts.
We further restrict our focus to extractive QA, i.e., QA instances where the correct answers can be represented as spans in the contexts.
We show that mere fine-tuning reaches state-of-the-art results, which can be further improved by a forgetting cost regularization (Riemer et al., 2017).
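
To illustrate the idea behind a forgetting cost, the following is a minimal sketch and not the authors' implementation. It assumes the regularizer is a weighted cross-entropy between the output distribution of the frozen source-trained model and that of the model being fine-tuned, added to the ordinary task loss. The function names and the `weight` hyperparameter are hypothetical; Riemer et al. (2017) give the exact formulation.

```python
import math
from typing import Sequence


def cross_entropy(target: Sequence[float], predicted: Sequence[float],
                  eps: float = 1e-12) -> float:
    """Cross-entropy H(target, predicted) between two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def loss_with_forgetting_cost(gold: Sequence[float],
                              adapted_probs: Sequence[float],
                              source_probs: Sequence[float],
                              weight: float = 0.5) -> float:
    """Task loss plus an assumed forgetting penalty.

    gold          -- one-hot target distribution for the current example
    adapted_probs -- predictions of the model being fine-tuned on the target data
    source_probs  -- predictions of the frozen model trained on the source data
    weight        -- hypothetical hyperparameter scaling the penalty
    """
    task_loss = cross_entropy(gold, adapted_probs)
    # Penalize drifting away from what the source-trained model predicted.
    forgetting = cross_entropy(source_probs, adapted_probs)
    return task_loss + weight * forgetting
```

With `weight=0` the function reduces to plain fine-tuning; larger values keep the adapted model closer to the behavior it learned on the source dataset.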

Summary

Introduction

Question answering (QA) is the task of retrieving answers to a question given one or more contexts. It has been explored both in the open-domain setting (Voorhees et al., 1999) and in domain-specific settings, such as BioASQ for the biomedical domain (Tsatsaronis et al., 2015). The BioASQ challenge provides ≈ 900 factoid and list questions, i.e., questions with one and several answers, respectively. This work focuses on answering these questions, for example: Which drugs are included in the FEC-75 regimen? We further restrict our focus to extractive QA, i.e., QA instances where the correct answers can be represented as spans in the contexts. Contexts are relevant documents which are provided by an information retrieval (IR) system.
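
As a small illustration of what "answers represented as spans in the contexts" means, the sketch below locates answer strings as token index ranges in a context. It is a simplified assumption-laden example: whitespace tokenization, case-insensitive exact matching, and a hypothetical context snippet written for this example rather than taken from BioASQ. The function name `find_answer_spans` is likewise hypothetical.

```python
from typing import List, Tuple


def find_answer_spans(context: str, answer: str) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) token indices of every occurrence of `answer`."""
    ctx_tokens = context.lower().split()
    ans_tokens = answer.lower().split()
    if not ans_tokens:
        return []
    n = len(ans_tokens)
    spans = []
    for i in range(len(ctx_tokens) - n + 1):
        # A span matches when the context tokens equal the answer tokens.
        if ctx_tokens[i:i + n] == ans_tokens:
            spans.append((i, i + n - 1))
    return spans


# Hypothetical, pre-tokenized context for the list question in the text.
context = "the fec-75 regimen combines fluorouracil , epirubicin and cyclophosphamide"

# A list question has several answers, each of which is its own span.
answers = ["fluorouracil", "epirubicin", "cyclophosphamide"]
spans = {a: find_answer_spans(context, a) for a in answers}
```

A factoid question would correspond to a single such span, while a list question maps to several; an answer that never appears verbatim in the context cannot be handled extractively and yields an empty list here.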

Methods

Results

Discussion

Conclusion