Abstract

Extractive summarization for academic articles in natural sciences and medicine has attracted attention for a long time. However, most existing extractive summarization models often process academic articles with sentence classification models, which are hard to produce comprehensive summaries. To address this issue, we explore a new view to solve the extractive summarization of academic articles in natural sciences and medicine by taking it as a question-answering process. We propose a novel framework, MRC-Sum, where the extractive summarization for academic articles in natural sciences and medicine is cast as an MRC (Machine Reading Comprehension) task. To instantiate MRC-Sum, article-summary pairs in the summarization datasets are firstly reconstructed into (Question, Answer, Context) triples in the MRC task. Several questions are designed to cover the main aspects (e.g. Background, Method, Result, Conclusion) of the articles in natural sciences and medicine. A novel strategy is proposed to solve the problem of the non-existence of the ground truth answer spans. Then MRC-Sum is trained on the reconstructed datasets and large-scale pre-trained models. During the inference stage, four answer spans of the predefined questions are given by MRC-Sum and concatenated to form the final summary for each article. Experiments on three publicly available benchmarks, i.e., the Covid, PubMed, and arXiv datasets, demonstrate the effectiveness of MRC-Sum. Specifically, MRC-Sum outperforms advanced extractive summarization baselines on the Covid dataset and achieves competitive results on the PubMed and arXiv datasets. We also propose a novel metric, COMPREHS, to automatically evaluate the comprehensiveness of the system summaries for academic articles in natural sciences and medicine. Abundant experiments are conducted and verified the reliability of the proposed metric. And the results of the COMPREHS metric show that MRC-Sum is able to generate more comprehensive summaries than the baseline models.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.