Abstract

A novel text summarization framework referred to as Skip-Thought Vector and Bi-Encoder Based Automatic Text Summarization (STV–BEATS) is proposed in this paper. STV–BEATS utilizes (a) skip-thought vectors to generate sentence-level embeddings, and (b) a Long Short-Term Memory (LSTM) based deep autoencoder to reduce the dimensionality of the skip-thought vectors. STV–BEATS combines extractive and abstractive summarization models to enhance the overall quality of the results. For each sentence, relevance and novelty metrics are calculated on the intermediate representation of the deep autoencoder to produce a final sentence score, and the highest-scoring sentences are selected to form the extractive summary. The abstractive component, in turn, consists of two encoders and a decoder: (a) a GRU-based bidirectional encoder and the decoder form a basic sequence-to-sequence model over the extractive summary, and (b) a second GRU-based unidirectional encoder provides fine-grained encoding. Extensive computational experiments are conducted to determine the effectiveness of STV–BEATS on three standard benchmark datasets, namely CNN/Daily Mail, DUC-2004, and DUC-2002. Further, Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is used to validate STV–BEATS. The results reveal that the proposed STV–BEATS is capable of effective text summarization and achieves substantially better results than state-of-the-art models.
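To make the extractive scoring step concrete, the following is a minimal Python sketch of greedy sentence selection over the autoencoder's latent representations. The abstract does not define the relevance and novelty metrics, so this sketch assumes relevance is cosine similarity between a sentence's latent vector and the document centroid, novelty penalizes similarity to already-selected sentences, and the two are blended by a weight `alpha`; the function names and the blend are illustrative, not the paper's exact formulation.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def select_extractive(latent, k, alpha=0.7):
    """Greedy extractive selection on autoencoder latent vectors.

    latent : (n_sentences, d) array of compressed skip-thought embeddings
             (the autoencoder's intermediate representation).
    k      : number of sentences to extract.
    alpha  : assumed weight blending relevance and novelty.
    """
    centroid = latent.mean(axis=0)
    # Relevance: similarity of each sentence to the document centroid (assumed metric).
    relevance = np.array([cosine(v, centroid) for v in latent])
    selected = []
    while len(selected) < k and len(selected) < len(latent):
        best, best_score = None, -np.inf
        for i in range(len(latent)):
            if i in selected:
                continue
            # Novelty: dissimilarity to the closest already-selected sentence (assumed metric).
            if selected:
                novelty = 1.0 - max(cosine(latent[i], latent[j]) for j in selected)
            else:
                novelty = 1.0
            score = alpha * relevance[i] + (1.0 - alpha) * novelty
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    # Return indices in document order so the summary reads coherently.
    return sorted(selected)
```

Under this sketch, the selected sentences would then be concatenated into the extractive summary that feeds the GRU-based abstractive stage.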
