Abstract

The growing use of the Internet and social networks has produced a large volume of online textual data, which leads to information overload and redundancy. It is important to eliminate this redundancy and reduce the time required to read such data. Thus, there is a persistent need for an automatic text summarization system that extracts the relevant and salient information from a collection of documents sharing the same or related topics and presents it in a condensed form that preserves the main topics. This paper proposes an automatic, generic, and extractive Arabic multi-document summarization system. The proposed system combines a clustering-based method with an evolutionary multi-objective optimization method. The clustering-based method discovers the main topics in the text, while the evolutionary multi-objective optimization method optimizes three objectives based on coverage, diversity/redundancy, and relevancy. The performance of the proposed system is evaluated on the TAC 2011 and DUC 2002 datasets, and the experimental results are compared using the ROUGE evaluation measure. The results show the effectiveness of the proposed system compared to other peer systems. On TAC 2011, the proposed system outperformed the other peer systems on all ROUGE metrics, achieving F-measures of 38.9%, 17.7%, 35.4%, and 15.8% for ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU4, respectively. On the DUC 2002 dataset, the proposed system achieved F-measures of 47.1%, 23.7%, 47.1%, and 20.4% for ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU4, respectively.

Highlights

  • The significant amount of the information on the Internet, such as the news articles posted on the websites, has increased the complexity of extracting useful information

  • The effectiveness of the proposed summarization approach is evaluated through a set of experiments

  • Our approach performs better than other peer systems, as is clear from the Recall-Oriented Understudy for Gisting Evaluation (ROUGE)-2 results, which are based on bi-gram matching
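
As a rough illustration of the bi-gram matching behind ROUGE-2, the following Python sketch computes n-gram precision, recall, and F-measure between a candidate summary and a single reference summary. It is a simplified sketch, not the official ROUGE toolkit and not the evaluation code used in the paper: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is assumed to be plain whitespace splitting (Arabic text would need proper tokenization and normalization), and no stemming or stop-word handling is applied.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=2):
    """Return (precision, recall, f_measure) for n-gram overlap.

    Assumption: whitespace tokenization and a single reference summary.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())

    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    p, r, f = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)
    print(f"ROUGE-2 precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Calling `rouge_n` with `n=1` gives the unigram variant (ROUGE-1). ROUGE-L and ROUGE-SU4 rely on longest common subsequences and skip-bigrams, respectively, and are not covered by this sketch.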

Summary

Introduction

The large amount of information on the Internet, such as news articles posted on websites, has made it harder to extract useful information. People find it distracting to read many articles with redundant information. It is therefore important to have an automated summarization system that helps identify the most important and salient information quickly. Automatic summarization systems have been applied in different domains, including search engines, web pages, news, and all forms of online reviews. Qumsiyeh and Ng [1] proposed a query-based summarizer to enhance web search engine results, and Modaresi et al. [2] presented a study showing the effect of using a query-based extractive summarization approach for media monitoring and media response analysis.
