Abstract

Unverified rumor detection has recently received considerable academic attention because of the societal impact of such potential misinformation. Previous work in this area has focused mainly on textual features, used a limited number of data sets and candidate algorithms, and disregarded model explainability altogether. This study proposes a more comprehensive social media rumor detection methodology. First, we investigate which machine learning or deep learning algorithm is best suited to classifying tweets into rumors and non-rumors using both textual and structured features. Next, we interpret these rumor detection models with the LIME method and assess the quality of the resulting explanations via fidelity and stability. To ensure the robustness of our methodology, we benchmark it on the well-known PHEME data sets and on two novel data sets, which we make publicly available. The results indicate that classical machine learning models perform best on small data sets, while transformer architectures achieve the highest predictive accuracy on larger data sets. Unfortunately, these high-accuracy transformer models pair poorly with LIME, resulting in low fidelity. Moreover, our study shows that all LIME explanations are unstable across folds. Based on these results, we argue that explanation quality should be evaluated using fidelity and stability before explanations are deployed. Our results further demonstrate that ostensibly model-agnostic explanation methods such as LIME are not fully model-agnostic in practice and should be used with caution.
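To make the evaluation pipeline concrete, the sketch below illustrates (it is not the paper's actual code) how a tweet classifier can be explained with LIME and how explanation quality might then be probed. The toy corpus, the fidelity proxy (the R² of LIME's local surrogate, exposed as `exp.score`), and the stability proxy (Jaccard overlap of top features across two random seeds) are all assumptions for illustration, not the authors' exact data or metric definitions.

```python
# Illustrative sketch, not the paper's pipeline: explain a toy tweet
# classifier with LIME, then probe fidelity and stability.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hand-made corpus standing in for a rumor data set such as PHEME.
tweets = [
    "BREAKING unconfirmed reports of an explosion downtown",
    "Official statement: the road closure is due to planned maintenance",
    "Sources say the CEO has resigned, nothing confirmed yet",
    "The city council published the final budget report today",
    "Rumor going around that schools will close tomorrow",
    "Weather service confirms heavy rain expected this weekend",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = rumor, 0 = non-rumor

# Simple textual-feature classifier (TF-IDF + logistic regression).
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(tweets, labels)

instance = "Unconfirmed reports say the airport has been evacuated"

def explain(random_state, k=5):
    """Return the LIME explanation and its top-k feature set."""
    explainer = LimeTextExplainer(
        class_names=["non-rumor", "rumor"], random_state=random_state
    )
    exp = explainer.explain_instance(
        instance, model.predict_proba, num_features=k
    )
    return exp, {word for word, _ in exp.as_list()}

exp_a, feats_a = explain(random_state=0)
exp_b, feats_b = explain(random_state=1)

# Fidelity proxy: how well the local surrogate fits the black-box model
# on the perturbed neighborhood (R^2 reported by LIME).
print("fidelity (surrogate R^2):", round(exp_a.score, 3))

# Stability proxy: do two runs agree on the most important words?
jaccard = len(feats_a & feats_b) / len(feats_a | feats_b)
print("stability (top-5 Jaccard across seeds):", round(jaccard, 3))
```

In this setup, a low surrogate R² or a low Jaccard overlap would flag exactly the kind of fidelity and stability problems the study reports, which is why we recommend running such checks before deploying explanations.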
