Abstract

Automatic text summarization plays a vital role in text retrieval. However, Hindi text summarization has received very little attention. This gap, together with the wide use of hybrid evolutionary and swarm-based methods for solving optimization problems, motivates us to develop a technique that achieves better results. We use a combination of seven features: Title feature, Sentence length, Sentence position, Numerical data, Thematic word, Proper noun, and Term Frequency & Inverse Sentence Frequency. The proposed BPSO-IGA, which combines Binary Particle Swarm Optimization (BPSO) with an improved genetic algorithm (IGA), produces a well-formatted summary from multiple documents. A method for calculating the score of each sentence is proposed first. The combination of BPSO and IGA then finds the optimal scores of the features, which are used to generate the final summary. The proposed method is compared with six well-known techniques, SummaRuNNer, ChunkRank, TGraph, UniRank, FF-SE, and MDS-ACS, on the FIRE 2011 Hindi dataset. To strengthen the hypothesis that all seven features are required for summary generation, summaries are also generated using only one feature and using combinations of several features. The best summary is generated when all seven features are considered. BPSO-IGA performs best among all the methods in terms of precision, recall, F-measure, cohesion, and non-redundancy. This research aims to design a multi-document extractive text summarizer for Hindi documents using a hybrid approach based on swarm intelligence and an evolutionary algorithm.
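As an illustration of one of the seven features, the sketch below computes a Term Frequency & Inverse Sentence Frequency (TF-ISF) score per sentence. The abstract does not give the exact formula used in the paper, so this assumes the common textbook form, in which a term's weight is its count in the sentence multiplied by log(N / sf), with N the number of sentences and sf the number of sentences containing the term, averaged over the sentence length. The function name `tf_isf_scores` and the pre-tokenized input are hypothetical choices made for the example.

```python
import math
from collections import Counter


def tf_isf_scores(sentences):
    """Return one TF-ISF score per tokenized sentence.

    Assumed formula (not taken from the paper):
    score(s) = (1 / |s|) * sum over terms t in s of tf(t, s) * log(N / sf(t))
    """
    n = len(sentences)
    # Sentence frequency: how many sentences contain each term at least once.
    sf = Counter(term for sent in sentences for term in set(sent))

    scores = []
    for sent in sentences:
        if not sent:
            # An empty sentence carries no terms, so it scores zero.
            scores.append(0.0)
            continue
        tf = Counter(sent)
        total = sum(count * math.log(n / sf[term]) for term, count in tf.items())
        scores.append(total / len(sent))
    return scores


# Toy input: each sentence is already split into tokens.
docs = [["a", "b"], ["a", "c"], ["a"]]
scores = tf_isf_scores(docs)
```

In this toy input the term "a" occurs in every sentence, so it contributes nothing, and the third sentence scores zero. How this feature is combined with the other six, and how BPSO-IGA tunes the feature scores, is not specified in the abstract and is not modeled here.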
