Abstract

The rapid growth of the Internet can make it difficult for users to find news in a particular category, especially when a large volume of news has to be categorized. News categorization is a technique for assigning news articles to categories, which makes retrieval easier for users. The Internet holds vast amounts of information, particularly news, so accurate and fast access is becoming ever more difficult. This paper compares news categorization using k-Nearest Neighbor, Naive Bayes, and Support Vector Machine. Using a variety of parameter settings and several preprocessing steps, the experiments show that k-Nearest Neighbor achieves accuracy competitive with Support Vector Machine, whereas Naive Bayes produces only average results: not as good as the best results of k-Nearest Neighbor and Support Vector Machine, yet not as bad as their worst. Overall, k-Nearest Neighbor with the correlation measure produces the best result in this experiment.




Introduction

With the current interest in big data and the rapid development of the Internet, information retrieval and text mining have become popular research fields worldwide. Research on text classification dates back to 1957, when [2] proposed a text classification approach based on word frequency. The first text classification study was conducted in 1960 [3], and text classification has since been extended to many areas, such as text information retrieval, electronic meetings, and text filtering [1]. This research conducts an experiment comparing three of the most popular text classification methods, k-Nearest Neighbor, Naïve Bayes, and Support Vector Machine, on news categorization: classifying English-language news into categories after several text preprocessing steps, which are explained in Section II. The main purpose of this research is to find the best method for categorizing news, so that a text can be categorized without reading it in full.
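
To make the idea of k-Nearest Neighbor with a correlation measure concrete, the following is a minimal sketch and not the paper's implementation. It assumes raw term counts as features, whitespace tokenization as the only preprocessing, Pearson correlation as the similarity, and k = 3. The toy corpus `TRAIN` and the helper names `tokenize`, `vectorize`, `pearson`, and `knn_predict` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    return text.lower().split()


def vectorize(tokens, vocab):
    # Term-count vector over a fixed vocabulary; unknown words are ignored.
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def pearson(a, b):
    # Pearson correlation between two equal-length numeric vectors.
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_a * var_b)


def knn_predict(train, query, k=3):
    # Build the vocabulary from the training texts only.
    vocab = sorted({w for text, _ in train for w in tokenize(text)})
    q = vectorize(tokenize(query), vocab)
    # Score every training document by its correlation with the query.
    scored = [
        (pearson(vectorize(tokenize(text), vocab), q), label)
        for text, label in train
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Majority vote among the k most similar documents.
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus; the paper's dataset is not reproduced here.
TRAIN = [
    ("the team won the football match", "sport"),
    ("the striker scored a late goal", "sport"),
    ("the coach praised the team after the match", "sport"),
    ("the bank raised interest rates", "business"),
    ("shares fell as the market closed", "business"),
    ("the company reported strong profit", "business"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "the team scored a goal in the match"))
```

Correlation centers each vector on its mean before comparing, so a word that occurs in almost every document (such as "the") contributes less than it would under a raw dot product. A realistic setup would add the preprocessing steps the paper describes and tune k on held-out data.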

