Abstract

Hate speech is a form of communication which contains hatred by doing things, such as inciting, insulting, disparaging, or demeaning a person or group. Hate speech issues in Indonesia often have linkages to politics. In 2018 and 2019, for example, the hate speech relates to the local leader and presidential elections. The hate speech actors commonly use social networks, such as Instagram, to spread their hatred words. About 60% of hate speech is found in the comments of the posts and it will be a real threat if not quickly detected. Our study aims to detect hate speech in Instagram comments. We propose the use of a word2vec method with skip-gram models and a modified TextCNN to learn and detect hate speech texts. Furthermore, random oversampling, random under sampling, and class weight was used to solve imbalanced dataset problems. The results show that the best accuracy, in term of F-score, is 93.70%, gained from a combination of word2vec skip-gram with window size 15, a modified TextCNN, and random oversampling methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call