Abstract

Social networks have taken on an irreplaceable role in our lives. Millions of people use them daily to communicate and to stay informed. This success has also brought a large amount of irrelevant content, and even misinformation, to social media. In this paper, we propose a user-centred framework that reduces the amount of irrelevant content in social networks in order to support later stages of data mining processes. The system also helps to reduce misinformation, since it selects credible and reputable users. It rests on the assumption that if a user is credible, then their content will be credible. In a first stage, our proposal uses word embeddings to build a set of interesting users according to their expertise. In a second stage, it employs social network metrics to narrow down the relevant users further according to their credibility in the network. To validate the framework, we tested it on two real Big Data problems on Twitter: one related to COVID-19 tweets and the other to the last United States elections, held on 3rd November. In both cases, finding relevant content can be difficult because of the large amount of data published in recent years. The proposed framework, called NOFACE, reduces the number of irrelevant users posting about a topic by keeping only those with higher credibility, and thus yields interesting information about the selected topic. This reduces irrelevant information and therefore mitigates the presence of misinformation when data mining methods are subsequently applied, improving the results obtained, as illustrated for the two topics using clustering, association rules and LDA techniques.
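
The two-stage selection described above can be illustrated with a minimal sketch. It is not the NOFACE implementation: the toy word vectors, the user records, the similarity threshold, and the follower-ratio credibility score are all assumptions made for illustration, since the abstract does not specify which embeddings or which social network metrics are used.

```python
import math

# Toy 3-dimensional word vectors (hypothetical; a real system would load
# trained word embeddings instead).
EMBEDDINGS = {
    "virus":    [0.9, 0.1, 0.0],
    "vaccine":  [0.8, 0.2, 0.1],
    "doctor":   [0.7, 0.3, 0.0],
    "football": [0.0, 0.1, 0.9],
    "fan":      [0.1, 0.0, 0.8],
}


def mean_vector(words):
    """Average the vectors of the known words; return None if none is known."""
    vecs = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_by_expertise(users, topic_words, min_similarity=0.8):
    """Stage 1: keep users whose profile text is close to the topic."""
    topic = mean_vector(topic_words)
    selected = []
    for user in users:
        profile = mean_vector(user["bio"].lower().split())
        if profile is not None and cosine(profile, topic) >= min_similarity:
            selected.append(user)
    return selected


def select_by_credibility(users, min_score=0.5):
    """Stage 2: keep users above a credibility threshold.

    The score is an assumed proxy (followers as a share of all connections),
    not the metric used in the paper.
    """
    def score(user):
        total = user["followers"] + user["following"]
        return user["followers"] / total if total else 0.0

    return [u for u in users if score(u) >= min_score]


# Hypothetical user records.
USERS = [
    {"name": "dr_ana", "bio": "doctor vaccine", "followers": 9000, "following": 300},
    {"name": "spam_bot", "bio": "vaccine virus", "followers": 5, "following": 4000},
    {"name": "goal_fan", "bio": "football fan", "followers": 2000, "following": 100},
]

experts = select_by_expertise(USERS, ["virus", "vaccine"])
credible = select_by_credibility(experts)
print([u["name"] for u in credible])
```

In this toy run, the first stage drops the off-topic account and the second stage drops the on-topic account with a very low follower ratio, so only one user's content would be passed on to later data mining steps.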
