Romanian Fake News Detection Using Machine Learning and Transformer-Based Approaches

Elisa Valentina Moisi, Bogdan Cornel Mihalca, Simina Maria Coman, Alexandrina Mirela Pater, Daniela Elena Popescu

doi:10.3390/app142411825

Abstract

Quick access to information has led to the spread of fake news, which has a strongly damaging impact on democracy, justice, and public trust. It is therefore crucial to analyze and evaluate fake news detection methods. This paper focuses on the detection of Romanian fake news. We present a comparative analysis of machine learning algorithms and Transformer-based models for Romanian fake news detection using three datasets: FakeRom, NEW, and the combination FakeRom + NEW. The NEW dataset was built using a scraping algorithm applied to the Veridica platform. Our approach uses the following machine learning models for detection: Naive Bayes (NB), Logistic Regression (LR), and Support Vector Machine (SVM). We also used two Transformer-based models: BERT-base-multilingual-cased and RoBERTa-large. The performance of the models was evaluated using accuracy, precision, recall, and F1 score. The results revealed that the BERT model trained on the NEW dataset consistently achieved the highest performance metrics across all test sets, reaching 96.5%. The Support Vector Machine trained on NEW was another top performer, reaching an accuracy of 94.6% on the combined test set.
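
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch for a binary fake/real classifier. The function name `binary_metrics`, the label encoding (1 = fake, 0 = real), and the toy labels are assumptions made for illustration; the abstract does not state how labels were encoded or whether scores were averaged across classes.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class.

    Assumption: labels are encoded as 1 = fake and 0 = real. This encoding
    is hypothetical and is not taken from the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented purely for illustration (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Accuracy alone can be misleading when the fake and real classes are imbalanced, which is one reason to report precision, recall, and F1 alongside it.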

Full Text