Recent abstractive summarization methods perform well on fact-based datasets such as XSUM and CNN/DM, extracting the core content of an input text and rewriting it as natural sentences. Unlike fact-based documents, opinion-based documents require a thorough analysis of sentiment and an understanding of the writer’s intention, yet existing models do not explicitly consider these factors. In this study, we therefore propose a novel text summarization model designed specifically for opinion-based documents. We identify the sentiment distribution of the entire document and train the summarization model to focus on the major opinions that conform to the intended message while randomly masking minor opinions. Experimental results show that the proposed model outperforms existing summarization models on opinion-based documents, capturing and highlighting the main opinions in the generated abstractive summaries.
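The masking idea described above can be illustrated with a minimal sketch. It is not the model proposed in this study: the toy word lexicon, the sentence-level granularity, the `<mask>` token, and the names `sentence_sentiment` and `mask_minor_opinions` are all assumptions made for illustration. The sketch labels each sentence, takes the most frequent non-neutral sentiment as the major opinion, and masks sentences carrying the other sentiment with probability `mask_prob`.

```python
import random
from collections import Counter

# Toy lexicon: an assumption for illustration, not the paper's sentiment model.
POSITIVE = {"great", "good", "love", "excellent", "comfortable"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "noisy"}


def sentence_sentiment(sentence):
    """Label a sentence 'pos', 'neg', or 'neu' by counting lexicon hits."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"


def mask_minor_opinions(sentences, mask_prob=0.5, mask_token="<mask>", rng=None):
    """Randomly replace minority-sentiment sentences with a mask token.

    Returns the (possibly masked) sentences and the majority sentiment label,
    or None as the label when no opinionated sentence is found.
    """
    rng = rng or random.Random()
    labels = [sentence_sentiment(s) for s in sentences]

    # Sentiment distribution over opinionated (non-neutral) sentences only.
    counts = Counter(label for label in labels if label != "neu")
    if not counts:
        return list(sentences), None
    major = counts.most_common(1)[0][0]

    masked = []
    for sentence, label in zip(sentences, labels):
        is_minor = label not in ("neu", major)
        if is_minor and rng.random() < mask_prob:
            masked.append(mask_token)
        else:
            masked.append(sentence)
    return masked, major


review = [
    "The room was great.",
    "The bed was comfortable.",
    "The street was noisy.",
    "We stayed two nights.",
]
masked_review, major_label = mask_minor_opinions(review, mask_prob=1.0)
```

With `mask_prob=1.0` every minority-sentiment sentence is masked, so the single negative sentence in `review` is replaced while the positive and neutral sentences are kept. In a training setup, the masked text would presumably serve as the summarizer's input, but how the masking is wired into training is not specified here.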