Abstract

Background: Biomedical question answering (QA) is a domain-specific sub-task of natural language processing that aims to answer a biomedical question from one or more related passages, giving people accurate healthcare-related information. Recently, approaches based on neural networks and large-scale pre-trained language models have substantially improved its performance. However, given the lexical characteristics of biomedical corpora and the small scale of the available datasets, there is still considerable room for improvement in biomedical QA.

Results: Motivated by the importance of syntactic and lexical features in biomedical corpora, we propose a new framework that extracts external features, such as part-of-speech (POS) tags and named-entity recognition (NER) labels, and fuses them with the original text representation encoded by a pre-trained language model to improve biomedical question answering performance. Our model improves on all three metrics on the BioASQ 6b, 7b, and 8b factoid question answering tasks.

Conclusions: Experiments on the BioASQ question answering dataset demonstrate the effectiveness of our external feature-enriched framework. They show that external lexical and syntactic features can improve the performance of pre-trained language models on biomedical question answering.

Highlights

  • Biomedical question answering (QA) is a domain-specific sub-task of natural language processing that aims to answer a biomedical question from one or more related passages, giving people accurate healthcare-related information

  • We focus on the extractive question answering task in the biomedical domain and propose a framework that extracts external syntactic and lexical features, such as POS and general named-entity recognition (NER), and fuses these auxiliary features into the sentence representation encoded by a pre-trained language model (PLM). The aim is to enrich the model with more syntactic information, emphasize the lexical representation of biomedical text, improve the matching between question and passages, and bridge the representation gap between general and domain-specific corpora without degrading the PLM's performance

  • BioASQ comprises two main tasks: Task A is the annotation of new biomedical documents from PubMed, a free search engine for life science and biomedical references, with MeSH headings; Task B consists of several biomedical semantic QA tasks, including information retrieval, multi-type question answering, and summarization
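
The feature fusion described in the highlights can be illustrated with a minimal sketch. It assumes fusion by simple concatenation of one-hot tag vectors onto each token vector; the text here does not specify the actual fusion method, so this is only one plausible reading. The tag inventories, the function names `one_hot` and `fuse_features`, and the toy vectors are all hypothetical.

```python
from typing import List

# Hypothetical tag inventories (assumption: the source does not list its tag sets).
# The last entry of each list doubles as the fallback for unknown tags.
POS_TAGS = ["NOUN", "VERB", "ADJ", "OTHER"]
NER_TAGS = ["ENTITY", "O"]


def one_hot(tag: str, inventory: List[str]) -> List[float]:
    """Encode a tag as a one-hot vector; unknown tags map to the last slot."""
    index = inventory.index(tag) if tag in inventory else len(inventory) - 1
    return [1.0 if i == index else 0.0 for i in range(len(inventory))]


def fuse_features(
    token_vectors: List[List[float]],
    pos_tags: List[str],
    ner_tags: List[str],
) -> List[List[float]]:
    """Append POS and NER one-hot vectors to each token representation.

    Concatenation is an assumed fusion strategy. It leaves the original
    token vector untouched, so the encoder output itself is not altered.
    """
    if not (len(token_vectors) == len(pos_tags) == len(ner_tags)):
        raise ValueError("every token needs exactly one POS tag and one NER tag")
    fused = []
    for vector, pos, ner in zip(token_vectors, pos_tags, ner_tags):
        fused.append(list(vector) + one_hot(pos, POS_TAGS) + one_hot(ner, NER_TAGS))
    return fused


# Toy stand-in for the output of a pre-trained language model (two tokens, dim 3).
encoded = [[0.2, -0.1, 0.7], [0.5, 0.3, -0.4]]
fused = fuse_features(encoded, ["NOUN", "VERB"], ["ENTITY", "O"])
print(len(fused[0]))  # 3 encoder dims + 4 POS dims + 2 NER dims
```

In practice the tags would come from an external tagger and would likely be mapped to learned embeddings rather than one-hot vectors; the one-hot form is used here only to keep the sketch dependency-free.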

Introduction

With the development of network technology and the accumulation of big data, more and more healthcare services have appeared, including online medical information retrieval and biomedical question answering applications, which help people find health information and biomedical knowledge quickly and economically [1]. Among these healthcare application scenarios, biomedical question answering (QA), a sub-task of natural language processing in the biomedical domain that locates and extracts the required biomedical text spans, is a basic and useful method for knowledge retrieval and representation. It is highly applicable because it can provide users with a reliable answer drawn from many related biomedical passages and can serve as the last step of an automatic biomedical QA system and of some healthcare services.
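
To make "locating and extracting text spans" concrete, the following sketch shows a common way extractive QA systems select an answer: each passage token receives a start score and an end score, and the highest-scoring valid span is returned. This is a generic illustration, not the method of this paper; the function name `best_span`, the `max_length` limit, and the toy scores are assumptions.

```python
from typing import List, Tuple


def best_span(
    start_scores: List[float],
    end_scores: List[float],
    max_length: int = 10,
) -> Tuple[int, int]:
    """Return (start, end) token indices, inclusive, of the best-scoring span.

    A span is valid if start <= end and it covers at most max_length tokens.
    Its score is the sum of the start score and the end score.
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("start and end scores must be non-empty and equal in length")
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        last = min(start + max_length, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Toy passage and hypothetical model scores (not real model output).
tokens = ["BRCA1", "mutations", "increase", "breast", "cancer", "risk"]
start_scores = [0.1, 0.0, 0.2, 2.5, 0.3, 0.1]
end_scores = [0.0, 0.1, 0.1, 0.4, 2.8, 0.6]

start, end = best_span(start_scores, end_scores)
answer = " ".join(tokens[start : end + 1])
print(answer)
```

The exhaustive search is quadratic in the worst case but bounded by `max_length`, which is usually small for factoid answers.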
