Abstract
Fake news is information that does not represent reality but is commonly shared on the internet as if it were true, mainly because of its dramatic, appealing, and controversial content. Finding characteristics that help identify fake news is therefore a relevant problem, especially now that a growing volume of fake news spreads across the internet every day. This work aims to extract knowledge from Brazilian fake news data using statistical learning. First, an exploratory data analysis of the available variables is performed to extract insights from the differences between fake and true news. Then, modelling and prediction are carried out. The learning phase builds a model and measures which features best explain the behaviour of misleading texts, leading to a parsimonious model. Finally, the test phase estimates the accuracy of the fitted model using 10-fold cross-validation in a Monte Carlo framework. The results show that four variables are significant in explaining fake news. Moreover, our model achieved an F-measure of 0.941, comparable to the state of the art for a single classifier, while having the advantage of being parsimonious. Details and code for this work can be found at https://github.com/limagbz/fake-news-ptBR.
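As an illustration of the test phase, the sketch below shows one possible reading of "10-fold cross-validation in a Monte Carlo framework": the data are reshuffled and split into 10 folds many times, and the F-measure is averaged over all folds and repetitions. Everything in it is an assumption made for illustration, not the authors' implementation: the function names, the number of repetitions, the synthetic one-feature data, and in particular the midpoint-threshold classifier, which is only a stand-in for the paper's fitted model.

```python
import random
from statistics import mean


def f_measure(y_true, y_pred):
    """F-measure (F1) for the positive class, labelled 1 (here: fake)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_threshold(xs, ys):
    """Hypothetical stand-in model: midpoint between the two class means."""
    mean_pos = mean(x for x, y in zip(xs, ys) if y == 1)
    mean_neg = mean(x for x, y in zip(xs, ys) if y == 0)
    return (mean_pos + mean_neg) / 2, mean_pos > mean_neg


def predict(model, xs):
    """Label a value 1 when it falls on the positive side of the threshold."""
    threshold, pos_above = model
    return [int((x > threshold) == pos_above) for x in xs]


def monte_carlo_kfold(xs, ys, k=10, repeats=100, seed=0):
    """Average F-measure over `repeats` random k-fold partitions."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)                      # new random partition each repetition
        folds = [idx[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            model = fit_threshold([xs[i] for i in train_idx],
                                  [ys[i] for i in train_idx])
            y_pred = predict(model, [xs[i] for i in test_idx])
            scores.append(f_measure([ys[i] for i in test_idx], y_pred))
    return mean(scores)


def make_toy_data(n_per_class=100, seed=1):
    """Synthetic one-feature data; not related to the paper's corpus."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]    # label 0 (true)
    xs += [rng.gauss(3.0, 1.0) for _ in range(n_per_class)]   # label 1 (fake)
    ys = [0] * n_per_class + [1] * n_per_class
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_toy_data()
    print(f"mean F-measure: {monte_carlo_kfold(xs, ys, repeats=20):.3f}")
```

Fixing the seed makes the estimate reproducible. In a real evaluation the threshold rule would be replaced by the fitted model, and the folds would typically be stratified so that each one contains both classes.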