Hybrid deep learning model for YouTube spam comment detection

Muhammad Sam'An, Khrisna Imaddudin

doi:10.11591/ijece.v14i3.pp3313-3319

Open Access

https://doi.org/10.11591/ijece.v14i3.pp3313-3319

Abstract

Social media platforms, including YouTube and Facebook, allow users to interact through comments and videos. However, the openness of these platforms also makes them susceptible to spammers engaging in phishing, malware distribution, and advertisement dissemination. In response, our study introduces a technique for detecting features indicative of spam in the comments on shared videos. The initial phase involves collecting data from the University of California, Irvine (UCI) machine learning repository and preprocessing it with tokenization and lemmatization. Subsequently, a rigorous feature selection process is executed, and experiments are conducted with several proposed classification models. The performance evaluation shows high accuracy in identifying spam comments on YouTube: convolutional neural network with gated recurrent unit (CNN-GRU) at 95.92%, convolutional neural network with long short-term memory (CNN-LSTM) at 95.41%, convolutional neural network with bidirectional long short-term memory (CNN-biLSTM) at 96.43%, gated recurrent unit (GRU) at 95.41%, long short-term memory (LSTM) at 94.13%, bidirectional long short-term memory (biLSTM) at 96.94%, and convolutional neural network (CNN) at 94.64%. These results highlight the contribution of our approach to spam detection and to strengthening online security.
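
The preprocessing stage named in the abstract (tokenization followed by lemmatization) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the suffix-stripping rule used in place of a real dictionary-based lemmatizer, and the padding length are all assumptions made for illustration. The sketch only shows how raw comments might become fixed-length integer sequences of the kind a CNN or recurrent model typically consumes.

```python
import re
from collections import Counter

# Toy suffix rules standing in for a real lemmatizer.
# Assumption for illustration only; the paper does not specify its lemmatizer here.
_SUFFIXES = ("ing", "ed", "s")


def tokenize(comment):
    """Lowercase a comment and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", comment.lower())


def crude_lemma(token):
    """Strip at most one common suffix, keeping a stem of at least 3 characters."""
    for suffix in _SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def build_vocab(token_lists):
    """Map each token to an integer ID; 0 is reserved for padding, 1 for unknown."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": 0, "<unk>": 1}
    # Most frequent tokens first, ties broken alphabetically for determinism.
    for tok, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to IDs, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, 1) for tok in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))


def preprocess(comments, max_len=8):
    """Tokenize, normalize, and encode a list of comments."""
    token_lists = [[crude_lemma(t) for t in tokenize(c)] for c in comments]
    vocab = build_vocab(token_lists)
    return [encode(toks, vocab, max_len) for toks in token_lists], vocab
```

In practice the suffix rule would be replaced by a proper lemmatizer, and the encoded sequences would feed an embedding layer ahead of the convolutional or recurrent layers.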
