Abstract

In this paper, we study the challenging problem of automatically generating citation texts in scholarly papers. Given the context of a citing paper A and a cited paper B, the task aims to generate a short text that describes B in the given context of A. A major challenge in addressing this task is the lack of training data. Explicit citation texts are usually easy to extract, but implicit citation texts are not. We therefore first train an implicit citation extraction model based on BERT and use it to construct a large training dataset for the citation text generation task. We then propose and train a multi-source pointer-generator network with a cross-attention mechanism for citation text generation. Empirical evaluation results on a manually labeled test dataset verify the efficacy of our model. This pilot study confirms the feasibility of automatically generating citation texts in scholarly papers, and the technique has great potential to help researchers prepare their scientific papers.

Highlights

  • A scientific paper usually needs to cite a lot of reference papers and introduce each reference paper with some text

  • To reduce the burden on researchers, we propose and try to address the task of automatic citation text generation

  • We propose a new task of automatic citation text generation in scholarly papers


Summary

Introduction

A scientific paper usually needs to cite many reference papers and introduce each of them with some text. Given a cited paper B and the context in a citing paper A (i.e., the sentences before and after a specific position in paper A), the citation text generation task aims to generate a short text that describes B with respect to the given context in A. The task resembles scholarly paper summarization (Luhn, 1958; Edmundson, 1969; Qazvinian and Radev, 2008; Mei and Zhai, 2008): both tasks aim to produce a text that describes the cited paper B. However, one paper may cite another paper several times at different positions and describe it differently each time, because the specific contexts differ. Another difference between the two tasks is the length of the text. The difficulty is that, given a different citing paper A or a different context within A, the task must produce a different citation text for the same B.
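The task definition above can be made concrete with a small sketch of how one example might be represented. This is only an illustration under stated assumptions: the class `CitationInstance`, the function `build_model_input`, the field names, and the `[CITE]` position marker are all hypothetical and do not come from the paper. The sketch shows only that the same cited paper B paired with different contexts in A gives different model inputs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class CitationInstance:
    """One example for the task (hypothetical schema, not the paper's)."""
    context_before: List[str]   # sentences of citing paper A before the position
    cited_paper_text: str       # text representing cited paper B
    target: str                 # reference citation text to be generated
    context_after: List[str] = field(default_factory=list)  # sentences after the position


def build_model_input(inst: CitationInstance) -> Dict[str, str]:
    """Return the two sources separately, as a multi-source model would read them.

    The "[CITE]" marker for the citation position is an assumption made
    for this illustration.
    """
    parts = [
        " ".join(inst.context_before),
        "[CITE]",
        " ".join(inst.context_after),
    ]
    # Skip empty parts so a missing left or right context adds no stray spaces.
    context = " ".join(p for p in parts if p)
    return {"context": context, "cited": inst.cited_paper_text}


if __name__ == "__main__":
    cited = "Text representing cited paper B."
    first = CitationInstance(
        context_before=["We study summarization."],
        cited_paper_text=cited,
        target="(citation text written for the first context)",
        context_after=["Our model differs."],
    )
    second = CitationInstance(
        context_before=["We build a dataset."],
        cited_paper_text=cited,
        target="(citation text written for the second context)",
    )
    # Same cited paper, different contexts, hence different inputs and targets.
    print(build_model_input(first))
    print(build_model_input(second))
```

Keeping the context and the cited paper as two separate fields, rather than concatenating them, mirrors the idea of a multi-source model; how the sources are actually encoded and combined is not specified in this excerpt.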

