Abstract
In this study, a real-time long-news summarization system is implemented based on the KoBART model. Owing to its architectural constraints, the KoBART model cannot summarize news articles whose token length is 1024 or more. Hence, we implemented a two-stage method that divides a long news article into paragraphs, summarizes each paragraph, and then re-summarizes the concatenated summaries. First, we evaluated the performance using an AI Hub dataset to validate the implemented two-stage summarization method. However, because most of the news articles in the AI Hub dataset are 1024 tokens or fewer, we also analyzed the performance on long news using a dataset provided by Hugging Face containing articles with a token length of 1024 or more. When summarizing long news articles of 1024 tokens or more by dividing them into 512-token paragraphs, the average ROUGE score is 33.99% and the runtime required for summarization is 0.8492 s. Therefore, we confirmed that the implemented long-news summarization system can provide real-time service, even for long news articles with a token length of 1024 or more.
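The two-stage procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `summarize` stands in for a call to the KoBART summarizer, whitespace splitting stands in for KoBART's tokenizer, and the 512-token chunk size follows the experimental setting reported in the abstract.

```python
from typing import Callable, List

MAX_TOKENS = 512  # chunk size used in the reported experiments


def split_into_chunks(tokens: List[str], max_tokens: int = MAX_TOKENS) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most max_tokens."""
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


def two_stage_summarize(text: str,
                        summarize: Callable[[str], str],
                        max_tokens: int = MAX_TOKENS) -> str:
    """Summarize text that may exceed the model's input limit.

    Short inputs are summarized directly. Long inputs are split into
    fixed-size chunks, each chunk is summarized (stage 1), and the
    concatenated partial summaries are summarized again (stage 2).
    """
    tokens = text.split()  # placeholder tokenizer; a real system would use KoBART's
    if len(tokens) <= max_tokens:
        return summarize(text)
    # Stage 1: summarize each chunk independently.
    partial = [summarize(" ".join(chunk))
               for chunk in split_into_chunks(tokens, max_tokens)]
    # Stage 2: re-summarize the joined partial summaries.
    return summarize(" ".join(partial))
```

In practice, stage 2 assumes the concatenated partial summaries fit within the model limit; very long inputs could require applying the reduction recursively.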