Abstract

It has never been easier to access content. Instead, we face an ever-increasing overload that undermines our ability to identify high-quality content relevant to the user. Automatic summarization techniques have been developed to distil content down to its key points, thereby shortening the time required to grasp the essence of a document and judge its relevance. Summarization is not a deterministic task and depends heavily on the writing style of the person creating the summary. In this work we present a method that, given a set of human-created summaries for a corpus, establishes which automatic extractive summarization technique best preserves the style of the human summary writer. To evaluate our approach, we use a corpus of 1,000 Science Daily articles with their corresponding human-written summaries and benchmark three extractive summarization techniques (BERT-based, keyword-scoring-based, and a Luhn summarizer), identifying the best style-preserving method and discussing the results.
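To illustrate the second of the benchmarked techniques, the sketch below implements a minimal keyword-scoring extractive summarizer: sentences are ranked by the summed frequency of their non-stopword terms, and the top-ranked sentences are returned in document order. The stopword list, scoring function, and summary length here are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal keyword-frequency extractive summarizer (illustrative sketch;
# scoring details are assumptions, not the paper's exact method).
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
             "that", "this", "for", "on", "with", "as", "it", "by", "be"}

def summarize(text: str, num_sentences: int = 3) -> str:
    # Naive sentence split on terminal punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Term frequencies over content (non-stopword) words.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence: str) -> int:
        # Score a sentence by the summed frequency of its content words.
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower())
                   if w not in STOPWORDS)

    # Pick the highest-scoring sentences, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]),
                    reverse=True)[:num_sentences]
    return " ".join(sentences[i] for i in sorted(ranked))

if __name__ == "__main__":
    article = ("Researchers developed a new battery material. The material "
               "stores more energy than lithium alternatives. The team also "
               "studied cost. Production cost may fall as the material is abundant.")
    print(summarize(article, num_sentences=2))
```

A style-preservation benchmark such as the one described in the abstract would compare the output of summarizers like this one against the human-written reference summaries for the same articles.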
