Abstract
News feeds are a rich source of updates on topics across many domains. Users interested in a specific domain need the important updates in that domain, collected from various sources and presented in organized form. This paper proposes a news summarization system for news data streams from RSS feeds and Google News. Because news stream analysis requires live content, the news data are collected continuously for our experiments. The major contributions of this work are domain-corpus-based news collection, news content extraction, hierarchical clustering of the news, and news summarization. Many existing news summarization systems do not provide dynamic content with a domain-wise representation. The proposed system addresses this by tagging the news feed with domain corpora and organizing the news streams into a hierarchical structure with a topic-wise representation. The news streams are then summarized for users with a novel summarization algorithm. The proposed system generates topic-wise summaries for the user; to our knowledge, no system in the literature performs news summarization by collecting the data dynamically and organizing the content hierarchically. In a comparison with existing systems, the proposed system achieves better results in generating news summaries. Online news content editors benefit from this system by instantly obtaining news summaries in their domains of interest.
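The tagging and organizing step described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes that a domain corpus is a plain set of keywords, that a headline is tagged with the domain whose keywords it overlaps most, and that matched keywords serve as topics. The names `DOMAIN_CORPUS`, `tag_domain`, and `organize` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical domain corpora: each domain maps to a small keyword set.
# A real system would use much larger, curated corpora.
DOMAIN_CORPUS = {
    "sports": {"match", "league", "goal", "tournament", "coach"},
    "finance": {"market", "stocks", "bank", "inflation", "shares"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def tag_domain(headline, corpus=DOMAIN_CORPUS):
    """Return the domain with the largest keyword overlap, or None if no keyword matches."""
    tokens = tokenize(headline)
    best, best_overlap = None, 0
    for domain, keywords in corpus.items():
        overlap = len(tokens & keywords)
        if overlap > best_overlap:
            best, best_overlap = domain, overlap
    return best


def organize(headlines, corpus=DOMAIN_CORPUS):
    """Build a two-level hierarchy: domain -> matched keyword (topic) -> headlines."""
    tree = defaultdict(lambda: defaultdict(list))
    for headline in headlines:
        domain = tag_domain(headline, corpus)
        if domain is None:
            continue  # headline matches no domain corpus; skip it
        for topic in sorted(tokenize(headline) & corpus[domain]):
            tree[domain][topic].append(headline)
    return {domain: dict(topics) for domain, topics in tree.items()}
```

Calling `organize` on a list of headlines returns a nested dictionary that can be browsed by domain and then by topic. The fixed two-level grouping stands in for the hierarchical clustering named in the abstract, which would instead derive the levels from content similarity.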
Highlights
Knowledge identification from online news articles has received keen attention among news readers, especially from Really Simple Syndication (RSS) feed-based news updates and Google News [1]
Our proposed system improves news summarization methods for news data streams, and content retrieval is simplified through hierarchical news content clustering and user collaborative filtering
The news content is summarized effectively based on the user-given query, using the collaborative filtering method
Summary
Knowledge identification from online news articles has received keen attention among news readers, especially from Really Simple Syndication (RSS) feed-based news updates and Google News [1]. Although keywords are tagged in the news content, it is important to organize the content in a hierarchical structure so that similar news content can be retrieved and summarized for users. A news-clustering-based summarization system is proposed to cluster various categories of news content from multiple news sources and to generate news summaries on topics of interest to the user. The extractive summary of a specific topic is generated from the clustered news content. The evaluation results show that the proposed system performs better than existing systems in summarizing news content for end users.
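Extractive summarization over a cluster of related articles can be sketched as follows. The scoring rule here (average word frequency within the cluster plus a bonus for each query term) is a common baseline chosen for illustration, not the novel algorithm the paper proposes. The function names, the stopword list, and the regular-expression sentence splitter are assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "for", "as", "by"}


def words(text):
    """Return lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(articles, query, max_sentences=2):
    """Select the highest-scoring sentences from a cluster of articles."""
    sentences = []
    for article in articles:
        # Naive split: break after ., ! or ? followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", article)
        sentences.extend(p.strip() for p in parts if p.strip())

    # Word frequencies across the whole cluster.
    freq = Counter(w for s in sentences for w in words(s))
    query_terms = set(words(query))

    def score(sentence):
        tokens = words(sentence)
        if not tokens:
            return 0.0
        base = sum(freq[t] for t in tokens) / len(tokens)
        bonus = sum(1 for t in set(tokens) if t in query_terms)
        return base + bonus

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore original sentence order
    return [sentences[i] for i in chosen]
```

Because the summary is extractive, every returned sentence appears verbatim in the input articles; the query only influences which sentences are selected.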