Abstract
The rapid growth of information and communication technology has sharply increased the volume of information available on the web, leading to information overload. Multi-document summarization is one effective way to address this problem. To improve the performance of multi-document summarization, this research combines four sentence features (sentence centroid, sentence position, sentence length, and the IsTheLongestSentence value) to weight sentences and identify the most informative content of a text. In addition, this research uses a new method to calculate the weight of the sentence position feature. Performance was evaluated using the ROUGE metrics ROUGE-N, ROUGE-L, ROUGE-W, ROUGE-S, and ROUGE-SU. On dataset clusters D133C and D134H, the proposed method outperforms the MEAD system on ROUGE-1, ROUGE-S, and ROUGE-SU for cluster D133C, and on ROUGE-2, ROUGE-3, ROUGE-4, ROUGE-L, and ROUGE-W for cluster D134H. This suggests that the method captures the important words in the extracted summary and that it tends to select longer sentences, since a longer sentence contains more material that can match the reference summaries.
Index Terms— multi-document summarization, document features, centroid-based summarization
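As a rough illustration of the feature combination described in the abstract, the following sketch scores each sentence as a weighted sum of the four features. It is not the authors' implementation, and several parts are assumptions made only for illustration: the function names `score_sentences` and `select_top`, the equal default weights, the use of plain term frequency as a stand-in for a centroid score, and the linear position decay (the abstract does not specify the new position-weighting method, so a simple placeholder is used here).

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def score_sentences(sentences, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return one combined score per sentence.

    Each feature is scaled to [0, 1] before weighting. The weights and
    the feature formulas are illustrative assumptions, not the paper's.
    """
    if not sentences:
        return []
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Centroid stand-in (assumption): sum of corpus term frequencies of
    # the distinct words in the sentence, instead of a TF-IDF centroid.
    tf = Counter(w for t in tokens for w in t)
    raw_centroid = [sum(tf[w] for w in set(t)) for t in tokens]
    max_centroid = max(raw_centroid) or 1

    lengths = [len(t) for t in tokens]
    max_len = max(lengths) or 1

    w_c, w_p, w_l, w_x = weights
    scores = []
    for i in range(n):
        centroid = raw_centroid[i] / max_centroid
        position = (n - i) / n          # placeholder: linear decay from 1.0
        length = lengths[i] / max_len
        is_longest = 1.0 if lengths[i] == max_len else 0.0
        scores.append(w_c * centroid + w_p * position
                      + w_l * length + w_x * is_longest)
    return scores

def select_top(sentences, k):
    """Indices of the k best-scoring sentences, in document order."""
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

With these placeholder formulas, the longest sentence receives both a full length score and the IsTheLongestSentence bonus, which is consistent with the abstract's observation that the method favors longer sentences.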