Abstract

Automatic text summarization has become an increasingly important task because a huge amount of textual information on the Internet needs to be processed. The Genetic Algorithm (GA) is an efficient approach to extractive text summarization: it searches for the best summary by optimizing a fitness function over successive generations. This paper proposes a novel extractive summarization method using a GA with two types of individuals, distinguished by their internal chromosomal structure. Each individual may have one or two full chromosomes, where a chromosome represents a candidate summary. In this type-based GA, good summaries are better preserved across generations, mutation is more likely to occur and uses more flexible strategies, and prominent summaries are more likely to be found in the solution space. Mutation can occur at two levels: offspring can be obtained by changing their parents' type or by flipping some genes (multi-point mutation) in their parents' chromosomes. The proposed approach was evaluated on the DUC2001, DUC2002 and CNN/DailyMail datasets, where it outperforms all other state-of-the-art extractive methods on all three ROUGE metrics. In particular, the ROUGE-1 and ROUGE-L scores improve considerably, by 10% to 20%, while the ROUGE-2 score is also the highest among the compared methods.

Keywords: Genetic algorithm; Extractive summarization
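The two-level mutation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoding or the exact type-change rule, so the binary sentence-selection encoding, the way a second chromosome is created or dropped, and all names (`Individual`, `mutate_type`, `mutate_genes`, `p_type`, `n_points`) are assumptions made for illustration only.

```python
import random
from dataclasses import dataclass
from typing import List

# Assumption: a chromosome is a binary vector over the document's sentences,
# where 1 means the sentence is included in the candidate summary.
Chromosome = List[int]


@dataclass
class Individual:
    # One or two full chromosomes; the count is the individual's "type".
    chromosomes: List[Chromosome]


def random_chromosome(n_sentences: int, rng: random.Random) -> Chromosome:
    return [rng.randint(0, 1) for _ in range(n_sentences)]


def flip_genes(chrom: Chromosome, n_points: int, rng: random.Random) -> Chromosome:
    """Multi-point mutation: flip n_points distinct genes in a copy."""
    flipped = list(chrom)
    for i in rng.sample(range(len(chrom)), n_points):
        flipped[i] = 1 - flipped[i]
    return flipped


def mutate_type(parent: Individual, rng: random.Random) -> Individual:
    """Type-level mutation (hypothetical rule): switch between one and two chromosomes."""
    if len(parent.chromosomes) == 1:
        n = len(parent.chromosomes[0])
        # Assumption: the added chromosome is drawn at random.
        return Individual([list(parent.chromosomes[0]), random_chromosome(n, rng)])
    # Assumption: one of the two chromosomes is kept at random.
    return Individual([list(rng.choice(parent.chromosomes))])


def mutate_genes(parent: Individual, n_points: int, rng: random.Random) -> Individual:
    """Gene-level mutation: apply multi-point flips to every chromosome."""
    return Individual([flip_genes(c, n_points, rng) for c in parent.chromosomes])


def mutate(parent: Individual, rng: random.Random,
           p_type: float = 0.5, n_points: int = 2) -> Individual:
    """Pick one of the two mutation levels; p_type is an illustrative parameter."""
    if rng.random() < p_type:
        return mutate_type(parent, rng)
    return mutate_genes(parent, n_points, rng)
```

In this sketch the parent is never modified in place, so a good candidate summary can survive alongside its mutated offspring. The fitness function and selection step are omitted because the abstract does not describe them.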
