X-SCITLDR

Sotaro Takeshita, Simone Paolo Ponzetto, Niklas Friedrich, Tommaso Green, Kai Eckert

doi:10.1145/3529372.3530938

Abstract

The number of scientific publications is increasing rapidly, causing information overload for researchers and making it hard for scholars to keep up to date with current trends and lines of work. Consequently, recent work on text mining for scholarly publications has investigated automatic text summarization, including extreme summarization, for this domain. However, previous work has concentrated only on monolingual settings, primarily in English. In this paper, we fill this research gap and present an abstractive cross-lingual summarization dataset for four different languages in the scholarly domain, which enables us to train and evaluate models that process English papers and generate summaries in German, Italian, Chinese and Japanese. We present our new X-SCITLDR dataset for multilingual summarization and thoroughly benchmark different models based on a state-of-the-art multilingual pre-trained model, including a two-stage 'summarize and translate' approach and a direct cross-lingual model. We additionally explore the benefits of intermediate-stage training using English monolingual summarization and machine translation as intermediate tasks and analyze performance in zero- and few-shot scenarios.
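
The two designs compared in the abstract, a two-stage 'summarize and translate' pipeline and a direct cross-lingual model, can be contrasted with a minimal sketch. Every name below (`summarize_then_translate`, `direct_cross_lingual`, the `toy_*` stand-ins and the type aliases) is hypothetical and chosen only for illustration; the abstract does not specify an implementation, and the toy functions merely stand in for trained models so that the example runs.

```python
from typing import Callable

# Hypothetical type aliases; none of these names come from the paper.
Summarizer = Callable[[str], str]              # English document -> English summary
Translator = Callable[[str, str], str]         # (text, target language) -> translated text
CrossLingualModel = Callable[[str, str], str]  # (English document, target language) -> summary


def summarize_then_translate(document: str, target_lang: str,
                             summarize: Summarizer, translate: Translator) -> str:
    """Two-stage pipeline: summarize in English first, then translate the summary."""
    english_summary = summarize(document)
    return translate(english_summary, target_lang)


def direct_cross_lingual(document: str, target_lang: str,
                         model: CrossLingualModel) -> str:
    """Direct approach: a single model maps the English document to a target-language summary."""
    return model(document, target_lang)


# Toy stand-ins (assumptions) so the sketch runs without any trained model.
def toy_summarize(document: str) -> str:
    # Keep only the first sentence as a crude "extreme summary".
    return document.split(".")[0].strip() + "."


def toy_translate(text: str, target_lang: str) -> str:
    # Tag the text with the language code instead of really translating it.
    return f"[{target_lang}] {text}"


def toy_direct_model(document: str, target_lang: str) -> str:
    # One callable covering both steps; here it simply composes the toys.
    return toy_translate(toy_summarize(document), target_lang)
```

In the two-stage variant the summarizer and translator can be trained and swapped independently, while the direct variant exposes a single callable; which performs better is an empirical question that the paper's benchmarks address.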

Similar Papers

Cross-lingual extreme summarization of scholarly documents
Sotaro Takeshita ... Simone Paolo Ponzetto
International Journal on Digital Libraries | VOL. 25
10 Aug 2023

Research on the Optimal Selection Method of Fuzzy Semantics in English Long Sentence Machine Translation
Jia Liu
09 Dec 2022

Acquisition of English Corpus Machine Translation Based on Speech Recognition Technology
Chunyan Jing ... Guoying Liu
Scientific Programming | VOL. 2022
21 Sep 2022

Interactive Oral English Chinese Machine Translation based on Feature Extraction Algorithm in accordance with the biotechnological advancement
Yaodan Liang
Journal of Commercial Biotechnology | VOL. 26
30 Jun 2022
