Trending Topic Classification for Single-Label Using Multinomial Naive Bayes (MNB) and Multi-Label Using K-Nearest Neighbors (KNN)

Denis Eka Cahyani, Kartini Aprilia Pratiwi Nuzry

doi:10.1109/icitisee48480.2019.9003944

Abstract

Trending Topic is a Twitter feature presented as short text. Because this short text sometimes confuses users, trending topics need to be classified into labels: politics, sports, entertainment, tourism, business, and other news. A further problem is that one tweet can have more than one label, which is called multi-label. Single-label classification assigns a trending topic to exactly one label, while multi-label classification assigns it to more than one label. This paper aims to classify Twitter's trending topics using Multinomial Naive Bayes (MNB) for single-label data and K-Nearest Neighbor (KNN) for multi-label data. The steps are: collecting trending topic data along with their tweets, labeling and text preprocessing, TF-IDF weighting, single-label classification using MNB, multi-label classification using KNN with the Binary Relevance approach, and finally evaluation and analysis of the results. With K=3, KNN achieves 88.05% accuracy on multi-label data, while MNB achieves 82.53% accuracy on single-label data.
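The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the toy tweets, the helper names (`tfidf_fit`, `mnb_fit`, `knn_br_predict`, and so on), the smoothed IDF formula, the Laplace smoothing constant, and the use of cosine similarity for KNN are all assumptions made for illustration. Binary Relevance is shown as one independent yes/no majority vote per label over the same K=3 nearest neighbors.

```python
import math
from collections import Counter, defaultdict

def tfidf_fit(docs):
    """Learn smoothed IDF weights from tokenized documents (assumed formula)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vec(doc, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unknown terms are dropped."""
    tf = Counter(t for t in doc if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def mnb_fit(vecs, labels, alpha=1.0):
    """Multinomial Naive Bayes on fractional TF-IDF counts with Laplace smoothing."""
    vocab = {t for v in vecs for t in v}
    model = {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vecs, labels) if y == c]
        totals = defaultdict(float)
        for v in rows:
            for t, w in v.items():
                totals[t] += w
        denom = sum(totals.values()) + alpha * len(vocab)
        log_prior = math.log(len(rows) / len(vecs))
        log_lik = {t: math.log((totals[t] + alpha) / denom) for t in vocab}
        model[c] = (log_prior, log_lik)
    return model

def mnb_predict(model, vec):
    """Return the single label with the highest posterior log-score."""
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(w * log_lik[t] for t, w in vec.items() if t in log_lik)
    return max(model, key=score)

def knn_br_predict(train_vecs, train_labelsets, vec, label_names, k=3):
    """Binary Relevance with KNN: each label is decided by its own majority vote."""
    order = sorted(range(len(train_vecs)),
                   key=lambda i: cosine(train_vecs[i], vec), reverse=True)
    nearest = order[:k]
    return {lab for lab in label_names
            if sum(lab in train_labelsets[i] for i in nearest) > k / 2}

# Hypothetical toy data, not taken from the paper's dataset.
train = [
    ("election vote president debate", {"politics"}),
    ("senate vote election law", {"politics"}),
    ("football match goal league", {"sports"}),
    ("league final goal striker", {"sports"}),
    ("president attends football final", {"politics", "sports"}),
    ("president watches football match", {"politics", "sports"}),
]
docs = [text.split() for text, _ in train]
labelsets = [labels for _, labels in train]
idf = tfidf_fit(docs)
vecs = [tfidf_vec(d, idf) for d in docs]

# Single-label branch: train MNB only on tweets that carry exactly one label.
single = [(v, next(iter(ls))) for v, ls in zip(vecs, labelsets) if len(ls) == 1]
mnb = mnb_fit([v for v, _ in single], [y for _, y in single])
single_pred = mnb_predict(mnb, tfidf_vec("goal league match".split(), idf))

# Multi-label branch: KNN with Binary Relevance over all tweets, K=3.
multi_pred = knn_br_predict(vecs, labelsets,
                            tfidf_vec("president football".split(), idf),
                            ["politics", "sports"], k=3)
print(single_pred, sorted(multi_pred))
```

On this toy data the single-label query is assigned to sports and the multi-label query receives both politics and sports. In practice a library such as scikit-learn would likely replace these hand-written helpers; the pure-Python form is used here only to keep each step visible.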

Full Text