Abstract
AbstractText summarization is a process of extracting salient information from a source text and presenting that information to the user in a condensed form while preserving its main content. In the text summarization, most of the difficult problems are providing wide topic coverage and diversity in a summary. Research based on clustering, optimization, and evolutionary algorithm for text summarization has recently shown good results, making this a promising area. In this paper, for a text summarization, a two‐stage sentences selection model based on clustering and optimization techniques, called COSUM, is proposed. At the first stage, to discover all topics in a text, the sentences set is clustered by using k‐means method. At the second stage, for selection of salient sentences from clusters, an optimization model is proposed. This model optimizes an objective function that expressed as a harmonic mean of the objective functions enforcing the coverage and diversity of the selected sentences in the summary. To provide readability of a summary, this model also controls length of sentences selected in the candidate summary. For solving the optimization problem, an adaptive differential evolution algorithm with novel mutation strategy is developed. The method COSUM was compared with the 14 state‐of‐the‐art methods: DPSO‐EDASum; LexRank; CollabSum; UnifiedRank; 0–1 non‐linear; query, cluster, summarize; support vector machine; fuzzy evolutionary optimization model; conditional random fields; MA‐SingleDocSum; NetSum; manifold ranking; ESDS‐GHS‐GLO; and differential evolution, using ROUGE tool kit on the DUC2001 and DUC2002 data sets. Experimental results demonstrated that COSUM outperforms the state‐of‐the‐art methods in terms of ROUGE‐1 and ROUGE‐2 measures.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.