Indonesian Abstractive Summarization using Pre-trained Model

Rini Wijayanti, Dwi H Widyantoro, Masayu L Khodra

doi:10.1109/eiconcit50028.2021.9431880

Abstract

Automatic text summarization systems are increasingly needed to cope with the information explosion caused by internet growth. Since Indonesian is still considered an under-resourced language, we take advantage of pre-trained language models to perform abstractive summarization. This paper investigates the performance of BERT on Indonesian articles by comparing several pre-trained BERT models and evaluating the results with ROUGE scores. Our experiment shows that an English pre-trained model can produce a good summary of Indonesian text, but that using an Indonesian pre-trained model is more effective. The default training setup, which uses only the abstractive objective, performs better than two-stage fine-tuning, in which an extractive model must be trained in advance. We also found many meaningless words in the generated summaries. These findings come from a preliminary study aimed at improving the Indonesian abstractive summarization model.
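The abstract reports results as ROUGE values without saying how they were computed. As a rough illustration only, the sketch below computes ROUGE-N precision, recall, and F1 from clipped n-gram overlap. It assumes plain whitespace tokenization and lowercasing, which is simpler than standard ROUGE toolkits, and it is not the authors' evaluation code. The function name `rouge_n` and the two Indonesian sentences are hypothetical examples.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Simplified ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: tokens are lowercased and split on whitespace; no stemming
    or stopword handling is applied.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical system summary and reference summary.
summary = "presiden meresmikan jembatan baru"
reference = "presiden meresmikan jembatan baru di jakarta"
scores = rouge_n(summary, reference, n=1)
```

Recall rewards covering the reference, while precision penalizes extra words, so a summary padded with meaningless words would lower precision even when recall stays the same.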
