Abstract
This paper describes Macquarie University’s contribution to the BioASQ Challenge (BioASQ 6b, Phase B). We focused on the extraction of the ideal answers, and the task was approached as an instance of query-based multi-document summarisation. In particular, this paper focuses on the experiments related to the deep learning and reinforcement learning approaches used in the submitted runs. The best run used a deep learning model under a regression-based framework. The deep learning architecture used features derived from the output of LSTM chains on word embeddings, plus features based on similarity with the query, and sentence position. The reinforcement learning approach was a proof-of-concept prototype that trained a global policy using REINFORCE. The global policy was implemented as a neural network that used tf.idf features encoding the candidate sentence, question, and context.
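To make the reinforcement learning idea concrete, the following is a minimal sketch of REINFORCE applied to sentence selection. It is not the submitted system: the paper's global policy is a neural network over tf.idf features, whereas this sketch assumes a single logistic unit over hand-made features and a toy reward that counts agreement with a gold selection. All names (`run_episode`, `reinforce_update`, `train`, `reward_fn`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def run_episode(weights, features, rng):
    """Sample a select (1) / skip (0) action for each candidate sentence."""
    actions, probs = [], []
    for x in features:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        actions.append(1 if rng.random() < p else 0)
        probs.append(p)
    return actions, probs


def reinforce_update(weights, features, actions, probs, reward, baseline, lr):
    """One REINFORCE step: w += lr * (R - b) * sum_t grad log pi(a_t | x_t)."""
    advantage = reward - baseline
    new_weights = list(weights)
    for x, a, p in zip(features, actions, probs):
        # For a Bernoulli policy with logit w.x, grad log pi(a | x) = (a - p) * x.
        for j, xj in enumerate(x):
            new_weights[j] += lr * advantage * (a - p) * xj
    return new_weights


def train(features, reward_fn, episodes=2000, lr=0.1, seed=0):
    """Train the shared (global) policy weights from episode-level rewards."""
    rng = random.Random(seed)
    weights = [0.0] * len(features[0])
    baseline = 0.0
    for _ in range(episodes):
        actions, probs = run_episode(weights, features, rng)
        reward = reward_fn(actions)
        weights = reinforce_update(
            weights, features, actions, probs, reward, baseline, lr
        )
        # Running average of rewards, used as a variance-reduction baseline.
        baseline += 0.05 * (reward - baseline)
    return weights


# Toy data (assumption): each sentence is [bias, relevance-to-question].
FEATURES = [[1.0, 1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
GOLD = [1, 0, 1, 0]


def toy_reward(actions):
    """Fraction of decisions that agree with the gold selection."""
    return sum(a == g for a, g in zip(actions, GOLD)) / len(GOLD)


WEIGHTS = train(FEATURES, toy_reward)
```

In a realistic setting the toy reward would be replaced by an evaluation score computed on the selected sentences once the episode ends, which is the situation REINFORCE is designed for: the reward is only available for the complete summary, not for individual decisions.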
Highlights
The BioASQ Challenge consists of various tasks related to biomedical semantic indexing and question answering (Tsatsaronis et al., 2015).
We observed some differences during training, but in general the best model achieved a ROUGE score between 0.25 and 0.26 on the test set. This is higher than the ROUGE score of about 0.2 reported by Molla (2017b).
In this paper we have described the deep learning and reinforcement learning approaches used for the runs submitted to BioASQ 6b, Phase B, for the generation of ideal answers.
Summary
The BioASQ Challenge consists of various tasks related to biomedical semantic indexing and question answering (Tsatsaronis et al., 2015). Our participation in BioASQ for 2018 focused on Task B Phase B, where our system attempted to find the ideal answer given a question and a collection of relevant snippets of text. We approached this task as an instance of query-based multi-document summarisation, where the ideal answer is the summary to produce. The BioASQ Challenge focuses on a restricted domain, namely biomedical literature. The techniques developed for our system are nevertheless domain-agnostic and can be applied to any domain, provided that it has enough training data and a specialised corpus large enough to train word embeddings.
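The framing of the task as query-based summarisation can be illustrated with a short extractive sketch: score each candidate sentence against the question and return the top-scoring ones as the answer. This is only an illustration of the task setup. It assumes tf.idf cosine similarity as a stand-in scorer rather than the deep learning regressor used in the submitted runs, and the function names and example sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse tf.idf vector (a dict) per tokenised document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf, so terms that occur in every document keep a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarise(question, sentences, n=1):
    """Return the n sentences most similar to the question, in original order."""
    vectors = tfidf_vectors([tokenize(question)] + [tokenize(s) for s in sentences])
    q_vec, s_vecs = vectors[0], vectors[1:]
    scores = [cosine(q_vec, v) for v in s_vecs]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(top)]


# Made-up example input, not taken from the BioASQ data.
QUESTION = "What is the role of BRCA1 in breast cancer?"
SNIPPETS = [
    "The weather was mild during the trial.",
    "BRCA1 mutations increase the risk of breast cancer.",
    "Participants were recruited from three hospitals.",
]
ANSWER = summarise(QUESTION, SNIPPETS, n=1)
```

Nothing in this sketch is specific to the biomedical domain, which mirrors the point above: only the training data and the corpus used to build the representations would need to change.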