Abstract

News summarization is an important part of news analysis, but it is hindered by the large volume of news stories and by the need to classify them. This research aims to build a simple web-based system that summarizes and classifies news to support the analysis process. The proposed summarization method is TextRank, and the classification method is k-nearest neighbors (KNN). The system is expected to provide automatic summarization that makes news content easier to analyze. The classification model is built on sports news collected over three months and determines which of three branches a story belongs to: football, racket sports, or basketball. The TextRank summarization model was evaluated with ROUGE-1 and ROUGE-2, yielding scores of 0.79 and 0.67, respectively. The KNN classification model was tested with k=3 and k=5, yielding 0.9866 and 0.9666, respectively, so k=3 is used. The system is built with a web scraping library, TextRank, stopwords from PySastrawi, scikit-learn for the KNN classification module, and ngrok for publishing the web-based application. With ngrok, the application can be exposed over the internet through a temporary public URL without requiring hosting.
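As a rough illustration of what the ROUGE-1 and ROUGE-2 scores above measure, the following minimal sketch computes n-gram overlap between a generated summary and a reference summary. It is not the evaluation code used in this work: the function names `ngrams` and `rouge_n`, the whitespace tokenization, and the example sentences are all assumptions made for illustration, and the abstract does not state whether the reported values are recall, precision, or F1.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f1) for n-gram overlap of two strings.

    Assumption: tokens are lowercased and split on whitespace; a real
    evaluation would likely apply language-specific preprocessing.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram,
    # so repeated n-grams are only credited as often as both sides have them.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * recall * precision / (recall + precision) if overlap else 0.0
    return recall, precision, f1


# Hypothetical reference and system summaries, for illustration only.
reference = "the team won the final match"
candidate = "the team won the match"

r1 = rouge_n(candidate, reference, n=1)
r2 = rouge_n(candidate, reference, n=2)
print("ROUGE-1 (recall, precision, f1):", r1)
print("ROUGE-2 (recall, precision, f1):", r2)
```

ROUGE-2 is typically lower than ROUGE-1 for the same pair of summaries because matching two consecutive words is a stricter condition than matching single words, which is consistent with the 0.79 and 0.67 reported above.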

