Abstract

News is information disseminated by newspapers, radio, television, the internet, and other media. According to survey results, a large number of news titles covering many topics circulate on the internet, which makes it difficult for readers to find the news topic they want to read. This problem can be solved by grouping, also known as classification, carried out computationally. This study aims to classify several news topics in the Indonesian language using the K-Nearest Neighbor (KNN) classification model, with word2vec, a word embedding-based model, converting words into vectors to facilitate the classification process. The study also determines the optimal K value for KNN. The combined word2vec and KNN model achieves an accuracy of 89.2% at K=7 and outperforms the support vector machine, logistic regression, and random forest classification models.
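The pipeline described above (words to vectors, then KNN with a tuned K) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the study's implementation: the tiny `TOY_VECTORS` table stands in for a trained word2vec model, documents are represented by the average of their word vectors, similarity is cosine, and K is chosen by leave-one-out accuracy on the training set. The labels and tokens are hypothetical.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional word vectors; a real system would load these
# from a trained word2vec model instead of a hand-written table.
TOY_VECTORS = {
    "gol": [0.9, 0.1, 0.0],
    "liga": [0.8, 0.2, 0.1],
    "pemain": [0.85, 0.15, 0.05],
    "saham": [0.1, 0.9, 0.1],
    "rupiah": [0.05, 0.95, 0.0],
    "bank": [0.15, 0.85, 0.1],
}


def doc_vector(tokens, vectors):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no known tokens in document")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict(query, train, k):
    """Majority vote among the k training items most similar to query."""
    ranked = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def best_k(train, candidates):
    """Return the candidate k with the highest leave-one-out accuracy.

    Ties are broken in favour of the smaller k.
    """
    def loo_accuracy(k):
        hits = 0
        for i, (vec, label) in enumerate(train):
            rest = train[:i] + train[i + 1:]
            hits += knn_predict(vec, rest, k) == label
        return hits / len(train)

    return max(candidates, key=lambda k: (loo_accuracy(k), -k))


# Hypothetical labelled headlines, already tokenized.
TRAIN_DOCS = [
    (["gol", "liga"], "olahraga"),
    (["pemain", "gol"], "olahraga"),
    (["liga", "pemain"], "olahraga"),
    (["saham", "bank"], "ekonomi"),
    (["rupiah", "saham"], "ekonomi"),
    (["bank", "rupiah"], "ekonomi"),
]
TRAIN = [(doc_vector(tokens, TOY_VECTORS), label) for tokens, label in TRAIN_DOCS]

if __name__ == "__main__":
    k = best_k(TRAIN, [1, 3, 5])
    query = doc_vector(["gol", "pemain", "liga"], TOY_VECTORS)
    print(k, knn_predict(query, TRAIN, k))
```

On a real corpus the candidate K values would be scored on a held-out validation split or with cross-validation, which is presumably how a value such as K=7 would be found; the leave-one-out loop here is only the simplest stand-in.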

Highlights

  • News is information disseminated by newspapers, radio, television, the internet, and other media

  • Hundreds of news articles are written every day on various online Indonesian news portals, because many news outlets have switched from print media to electronic media that can be accessed online [2,3]

  • Research [3] addresses how to accurately classify large volumes of Indonesian news data using various computational models such as Neural Network, Support Vector Machine (SVM), Naïve Bayes, and K-Nearest Neighbor (KNN)

Summary

Introduction

News is information disseminated by newspapers, radio, television, the internet, and other media. According to survey results, a large number of news titles covering many topics circulate on the internet, which makes it difficult for readers to find the news topic they want to read. This problem can be solved by grouping, also known as classification. This study aims to classify several news topics in the Indonesian language using the KNN classification model, with word2vec converting words into vectors to facilitate the classification process. Paper [9] built a multilabel classification model for Indonesian news topics using the K-Nearest Neighbor (KNN) method. Research [11] applies the Porter Stemmer Enhancement algorithm in the stemming process and the Likelihood method for news classification. Research [16] aims to … and Support Vector Machine

