Abstract

Detecting rumors on social networks is of practical significance for reducing their negative impact on the real world. Research on Chinese rumor detection is relatively comprehensive, but Cantonese rumors have received far less attention. Cantonese is a major dialect of Chinese with more than 60 million speakers globally, yet Cantonese rumor detection faces three major challenges. First, no benchmark dataset of Cantonese rumors is available. Second, the unique linguistic characteristics of Cantonese are difficult to learn. Third, traditional rumor detection approaches cannot be directly applied to Cantonese rumors. We therefore propose a novel framework for Cantonese rumor detection using deep neural networks with feature fusion. To the best of our knowledge, this is the first study of Cantonese rumor detection on social networks. Specifically, we first build a Cantonese rumor dataset and a multi-domain Cantonese corpus. Next, we extract a total of 27 statistical features, seven of which are newly proposed. We then design a novel deep learning model called BLA to identify Cantonese rumors; it generates text and Jyutping embeddings using a further pre-trained BERT model and a CNN model. Finally, the BLA model integrates the statistical and semantic features to classify Cantonese rumors. Experiments demonstrate that the BLA model achieves an F1 score of 0.9225 on Cantonese rumor detection.
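
As an illustration only, the feature fusion idea and the reported metric can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the BLA model itself: fusion is assumed to be plain concatenation, a single logistic unit stands in for the classification head, and all names (`fuse`, `predict_proba`, `f1_score`) and all numeric values are hypothetical. The abstract does not specify how the real model combines its features.

```python
import math


def fuse(text_emb, jyutping_emb, stat_feats):
    """Concatenate the three feature views into a single vector.

    Assumption: fusion is simple concatenation. The abstract only says
    that statistical and semantic features are integrated.
    """
    return list(text_emb) + list(jyutping_emb) + list(stat_feats)


def predict_proba(fused, weights, bias):
    """Logistic unit over the fused vector (a stand-in classifier head)."""
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1 = rumor)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy inputs: a 2-d text embedding, a 1-d Jyutping embedding, and
# 2 statistical features (the paper extracts 27; the values are made up).
fused = fuse([0.1, 0.2], [0.3], [1.0, 2.0])
prob = predict_proba(fused, weights=[0.0] * len(fused), bias=0.0)
label = 1 if prob >= 0.5 else 0
```

In practice the embeddings would come from the pre-trained encoders and the weights would be learned. The sketch only shows where the statistical features join the semantic ones and how an F1 score is computed from predicted labels.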
