Abstractive text summarization aims to generate summaries that resemble human-written ones. This capability has typically been pursued with recurrent deep learning architectures such as RNNs, LSTMs, and GRUs. Previous studies on abstractive text summarization in Indonesian relied heavily on recurrent models, and the generated summaries exhibited cohesion and grammar errors that could degrade performance. The Transformer, a more recent architecture, relies entirely on the attention mechanism. Because it is non-recurrent, the Transformer avoids the dependency on hidden states that constrains recurrent models and can retain information across the entire input sequence. In this study, we evaluate how well the Transformer performs abstractive text summarization in Indonesian. Training was conducted with the pre-trained T5 model on the IndoSum dataset, which contains around 19K news-summary pairs. We achieved evaluation scores of 0.61 ROUGE-1 and 0.51 ROUGE-2.
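As a rough illustration of the pipeline described above (not the authors' code), the sketch below generates a summary with a pre-trained T5 checkpoint and scores it with ROUGE-1 and ROUGE-2 using the Hugging Face `transformers` and `evaluate` libraries. The checkpoint name and the example texts are placeholders, not values taken from the paper.

```python
# Minimal sketch, assuming a T5 checkpoint fine-tuned for Indonesian
# summarization is available; the model name below is hypothetical.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import evaluate

model_name = "indonesian-t5-summarization"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "..."    # an IndoSum news article (placeholder)
reference = "..."  # its human-written reference summary (placeholder)

# Encode the article with T5's summarization prefix and generate a summary.
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_length=150, num_beams=4)
prediction = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# Compare the generated summary against the reference with ROUGE.
rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=[prediction], references=[reference])
print(scores["rouge1"], scores["rouge2"])
```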