Abstract

Information overload is a serious problem that impairs the comprehension of information in the digital age, and text summarization can help alleviate it. To improve the performance of multi-document summarization, this study proposes a novel approach based on a mixture model with two components: a contextual topic model, from the family of Bayesian hierarchical topic models, for selecting candidate summary sentences, and a machine-learning regression model for generating the summary. By investigating hierarchical topics and their correlations with respect to the lexical co-occurrences of words, the proposed contextual topic model can determine the relevance of sentences more effectively, recognize latent topics, and arrange them hierarchically. Quantitative evaluation results from a practical application demonstrate that a system implementing this model can significantly improve summarization performance, making it comparable to state-of-the-art summarization systems.
