Abstract

Classifying hoax news, that is, news containing incorrect information, is one application of text categorization. Like machine text categorization applications in general, such a system consists of preprocessing followed by the execution of classification models. In this study, experiments were conducted to select the best technique for each sub-process, using 1200 hoax articles and 600 non-hoax articles collected manually. The experiments sought to determine the best preprocessing stages, choosing between stop word removal and stemming. The Decision Tree algorithm achieved an accuracy of 100%, while Naive Bayes showed a more stable level of accuracy across the dataset sizes used for all candidates. With Information Gain, TF-IDF, and GGA applied to the Naive Bayes, Support Vector Machine, and Decision Tree algorithms, no significant percentage change occurred for any candidate; after GGA (Optimize Generation) feature selection was used, however, the accuracy level increased. In the comparison of the Naive Bayes, Decision Tree, and Support Vector Machine classifiers combined with feature selection, the best result came from GGA + Decision Tree on candidate 2 (Paslon2), at 100%, and the lowest from Information Gain + Decision Tree on candidate 3, at 36.67%. Overall, accuracy improved for all algorithms after feature selection was applied.
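As a rough illustration of one of the feature selection criteria named above, the following sketch computes Information Gain for the presence or absence of a term over a labeled corpus. It is not the study's implementation, and the tooling the authors used is not stated here. The function names, the token-set document representation, and the four-document toy corpus are all assumptions made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting the corpus on presence of `term`.

    `docs` is a list of token sets and `labels` the matching class labels
    (a hypothetical representation chosen for this sketch).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    remainder = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - remainder


# Toy corpus, invented for illustration only (not data from the study).
docs = [
    {"shocking", "secret"},
    {"shocking", "cure"},
    {"election", "result"},
    {"election", "schedule"},
]
labels = ["hoax", "hoax", "no_hoax", "no_hoax"]

# Rank the vocabulary by information gain, highest first.
vocabulary = {token for d in docs for token in d}
ranked = sorted(
    vocabulary,
    key=lambda t: information_gain(docs, labels, t),
    reverse=True,
)
```

A feature selector built on this criterion would keep only the top-ranked terms before training a classifier such as Naive Bayes or a Decision Tree. How many terms to keep is a tuning choice that this sketch leaves open.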
