MultiSumm: Towards a Unified Model for Multi-Lingual Abstractive Summarization

Yue Cao, Jinge Yao, Xiaojun Wan, Dian Yu

doi:10.1609/aaai.v34i01.5328

Abstract

Automatic text summarization aims to produce a shorter version of the input text that conveys its most important information. However, multi-lingual text summarization, where a single model processes texts in multiple languages and outputs summaries in the corresponding languages, has rarely been studied. In this paper, we present MultiSumm, a novel multi-lingual model for abstractive summarization. MultiSumm is trained in two stages: (I) multi-lingual learning, which comprises language model training, auto-encoder training, and translation and back-translation training; and (II) joint summary generation training. We conduct experiments on summarization datasets for five rich-resource languages (English, Chinese, French, Spanish, and German) and two low-resource languages (Bosnian and Croatian). Experimental results show that our proposed model significantly outperforms a multi-lingual baseline model. Specifically, our model achieves performance comparable to or even better than that of models trained separately on each language. As an additional contribution, we construct the first summarization dataset for Bosnian and Croatian, containing 177,406 and 204,748 samples, respectively.
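
The two-stage training regime named in the abstract can be laid out as a simple task schedule. The sketch below is only an illustration under stated assumptions: the task identifiers, the `build_schedule` helper, the use of ISO 639-1 language codes, the choice to pair every language with every other for translation, and the sequential ordering of tasks are all hypothetical. The abstract does not specify which language pairs are used or how the objectives are interleaved.

```python
from itertools import permutations

# Languages named in the abstract, written as ISO 639-1 codes.
RICH_RESOURCE = ["en", "zh", "fr", "es", "de"]
LOW_RESOURCE = ["bs", "hr"]
LANGUAGES = RICH_RESOURCE + LOW_RESOURCE

# Stage (I) objectives named in the abstract; the identifiers are hypothetical.
MONOLINGUAL_TASKS = ["language_model", "auto_encoder"]
PAIRED_TASKS = ["translation", "back_translation"]


def build_schedule(languages):
    """Return a list of (stage, task, languages) steps for the two-stage regime."""
    schedule = []
    # Stage I: multi-lingual learning.
    for task in MONOLINGUAL_TASKS:
        for lang in languages:
            schedule.append(("I", task, (lang,)))
    for task in PAIRED_TASKS:
        # Assumption: every ordered language pair is used; the abstract does not say.
        for src, tgt in permutations(languages, 2):
            schedule.append(("I", task, (src, tgt)))
    # Stage II: joint summary generation training with one shared model.
    for lang in languages:
        schedule.append(("II", "summarization", (lang,)))
    return schedule


schedule = build_schedule(LANGUAGES)
```

Listing the steps this way only makes the structure of the regime explicit: stage I mixes monolingual and paired objectives, and stage II covers every language with a single summarization model.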

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 3, 2020
Citations: 13

