Abstract

Text summarization is one solution to information overload. Reducing a text without losing its meaning saves reading time while preserving the reader’s understanding. One of many text summarization algorithms is TextTeaser. The algorithm was originally intended for English text. However, because TextTeaser does not consider the meaning of the text, we apply it to text in Indonesian. The algorithm calculates four features: title feature, sentence length, sentence position, and keyword frequency. We use TextRank, an unsupervised and language-independent text summarization algorithm, to evaluate the summaries produced by TextTeaser. The results show that TextTeaser needs further improvement to obtain better accuracy.

Highlights

  • Automatic Text Summarization (ATS) is the process of reducing a text to its important sentences by a machine that implements a particular algorithm or method.

  • Extraction-based summarization extracts the important sentences from an article and combines them into one summary; the sentences it yields are taken from the original text without modification.

  • In the context of automatic text summarization, recall is the number of sections of the summary that are relevant to the original text, relative to the number of all sentences in the text.
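As a rough illustration of recall for extractive summaries, the sketch below uses the standard definition: the fraction of reference sentences that also appear in the candidate summary. The function name `sentence_recall`, the exact-match comparison of sentences, and the use of a second summary (for example, one produced by TextRank) as the reference are all assumptions of this example; the paper's exact formula may differ.

```python
def sentence_recall(candidate, reference):
    """Fraction of reference sentences that also appear in the candidate.

    Both arguments are lists of sentences (strings). Sentences are matched
    exactly, which is only reasonable for extractive summaries that copy
    sentences from the same source text without modification.
    """
    reference_set = set(reference)
    if not reference_set:
        # No reference sentences: define recall as 0.0 to avoid dividing by zero.
        return 0.0
    matched = reference_set & set(candidate)
    return len(matched) / len(reference_set)
```

For instance, if the candidate summary shares one sentence with a four-sentence reference summary, this function returns 0.25.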


Summary

Introduction

Automatic Text Summarization (ATS) is the process of reducing a text to its important sentences by a machine that implements a particular algorithm or method. One method of producing a text summary is extraction-based summarization. It extracts the important sentences from an article and combines them into one summary; the sentences it yields are taken from the original text without modification. The TextTeaser algorithm implements this method to summarize text. The algorithm is freely available as open source and can be downloaded from its GitHub repository. It calculates four features: title feature, sentence length, sentence position, and keyword frequency. We evaluate the performance of TextTeaser by comparing it with the TextRank algorithm.
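To make the four features concrete, the following is a minimal sketch of an extractive scorer in the style described above. It is not the TextTeaser implementation: the helper names, the naive sentence splitter, the assumed ideal sentence length of 20 words, the assumption that earlier sentences score higher, and the equal weighting of the four features are all illustrative choices made for this example.

```python
import re
from collections import Counter

IDEAL_LENGTH = 20  # assumed "ideal" sentence length in words (illustrative)


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def title_feature(words, title_words):
    """Fraction of distinct title words that also occur in the sentence."""
    if not title_words:
        return 0.0
    return len(set(words) & set(title_words)) / len(set(title_words))


def length_feature(words):
    """1.0 at the assumed ideal length, falling linearly towards 0.0."""
    return max(0.0, 1.0 - abs(IDEAL_LENGTH - len(words)) / IDEAL_LENGTH)


def position_feature(index, total):
    """Assumption: earlier sentences matter more (1.0 for the first one)."""
    return 1.0 - index / total


def keyword_feature(words, keyword_freq):
    """Mean relative document frequency of the words in the sentence."""
    if not words:
        return 0.0
    return sum(keyword_freq.get(w, 0.0) for w in words) / len(words)


def summarize(title, text, n=2):
    """Return the n highest-scoring sentences, kept in original order."""
    sentences = split_sentences(text)
    all_words = tokenize(text)
    counts = Counter(all_words)
    keyword_freq = {w: c / len(all_words) for w, c in counts.items()}
    title_words = tokenize(title)

    scored = []
    for i, sentence in enumerate(sentences):
        words = tokenize(sentence)
        score = (
            title_feature(words, title_words)
            + length_feature(words)
            + position_feature(i, len(sentences))
            + keyword_feature(words, keyword_freq)
        ) / 4.0  # equal weights are an assumption of this sketch
        scored.append((score, i, sentence))

    top = sorted(scored, reverse=True)[:n]
    return [sentence for _, _, sentence in sorted(top, key=lambda t: t[1])]
```

Because every feature is computed from surface statistics (word overlap, counts, and positions) rather than meaning, a scorer of this kind can be run on Indonesian text as readily as on English, which is the property the study relies on.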
