Abstract

This research explores the application of TF-IDF (Term Frequency-Inverse Document Frequency) and K-Nearest Neighbor (K-NN) to build a clickbait detection system for Indonesian online news headlines. TF-IDF is used to weight the importance of each word in a headline: headlines are tokenized and converted into numeric vectors. The resulting TF-IDF matrix serves as the feature set for the K-NN classifier, with k=1, so each headline is assigned the class of its single most similar neighbor. In evaluation, the model achieves accuracy, precision, recall, and F1-score of 1.0, and the confusion matrix shows no misclassifications, meaning every evaluated sample was classified correctly.
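
The pipeline described above (tokenize, weight with TF-IDF, classify with 1-NN) can be illustrated with a minimal sketch. It is not the authors' implementation: the four training headlines are invented toy data, the function names are hypothetical, and the smoothed IDF formula, whitespace tokenizer, and cosine similarity are all assumptions, since the abstract does not state which variants were used.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split; the paper's tokenizer is not specified.
    return text.lower().split()

def fit_idf(docs_tokens):
    # Smoothed IDF: log((1 + N) / (1 + df)) + 1 (an assumed variant).
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector, L2-normalized; terms unseen in training are dropped.
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def cosine(a, b):
    # Dot product of two L2-normalized sparse vectors equals their cosine similarity.
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def predict_1nn(headline, train_vecs, train_labels, idf):
    # k=1: return the label of the single most similar training headline.
    query = tfidf_vector(tokenize(headline), idf)
    best = max(range(len(train_vecs)), key=lambda i: cosine(query, train_vecs[i]))
    return train_labels[best]

# Hypothetical toy data, invented for illustration only.
train = [
    ("kamu tidak akan percaya apa yang terjadi selanjutnya", "clickbait"),
    ("wow ini rahasia artis yang bikin heboh", "clickbait"),
    ("pemerintah umumkan anggaran pendidikan tahun depan", "non-clickbait"),
    ("bank indonesia tahan suku bunga acuan", "non-clickbait"),
]
train_tokens = [tokenize(text) for text, _ in train]
train_labels = [label for _, label in train]
idf = fit_idf(train_tokens)
train_vecs = [tfidf_vector(tokens, idf) for tokens in train_tokens]

print(predict_1nn("rahasia artis ini bikin heboh warganet", train_vecs, train_labels, idf))
print(predict_1nn("bank indonesia umumkan suku bunga", train_vecs, train_labels, idf))
```

Because the vectors are L2-normalized, ranking neighbors by cosine similarity gives the same nearest neighbor as ranking by Euclidean distance, so the choice between the two does not affect this sketch. Note also that with k=1, any headline that appears in the training set is its own nearest neighbor, so scores should be measured on headlines the model has not seen.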
