Spam community detection & influence minimization using NRIM algorithm

Sakshi Srivastava, Supriya Agrahari, Anil Kumar Singh

doi:10.1016/j.chb.2023.107832

Abstract

Big Data is a research area in which many disciplines work together. Social media has grown in popularity as a tool for disseminating and gathering information. However, the success of platforms such as Twitter and Facebook has attracted not only genuine users but also spammers, who exploit social graphs, popular phrases, and hashtags to spread malware. This study uses several social network analysis and visualization methods, based on bibliometric data from the Web of Science, to examine the structure and patterns of interdisciplinary collaboration and emerging practice. To better understand spamming behavior on Twitter, a Twitter data set is analyzed thoroughly and its tweets are categorized as spam or non-spam. Earlier studies limited their scope to blocking the spammers with the highest individual negative influence. They neglected the cumulative impact of other spammers whose individual negative influence is low but whose impact values are higher. In this article, we develop an algorithm for detecting social spam, Node Rank-based Influence Minimization (NRIM), which integrates Node Rank with the impact value of spam. The proposed spam influence minimization model identifies spam-influential users and helps limit the flow of spam tweets within the Twitter network. We also propose an algorithm for detecting influential communities, to limit the spread of spam content through those communities. The primary focus of this paper is to reduce the impact of spam on Twitter data by identifying influential spammers with NRIM. First, tweets are classified as spam or non-spam using a machine learning algorithm. Next, the spam observed in the graph is analyzed, and the spammers are passed to the NRIM algorithm to find the influential ones. Finally, the negative impact of these spammers on the Twitter graph is reduced, and the effect is analyzed on query processing executed on the graph. The technique used to minimize the spammers' negative effect on the graph reduces query execution time by 12%.
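
As a rough illustration of the idea of combining a node's rank with the impact value of its spam, the sketch below ranks users on a toy follower graph. It is not the NRIM algorithm itself, because the abstract does not give its formula. The plain PageRank used for the rank, the product used to combine rank and impact, and the names `node_rank`, `influential_spammers`, and `spam_impact` are all assumptions made for this example.

```python
def node_rank(graph, damping=0.85, iterations=50):
    """Plain PageRank by power iteration (stand-in for the paper's Node Rank).

    graph maps each user to the list of users they point to (e.g. follow).
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[u] / n
                for v in nodes:
                    new_rank[v] += share
        rank = new_rank
    return rank


def influential_spammers(graph, spam_impact, k):
    """Return the k users with the highest rank * spam-impact score.

    spam_impact is a hypothetical per-user value in [0, 1], e.g. the share
    of a user's tweets that a classifier labeled as spam. Multiplying it by
    the rank is an assumption of this sketch, not the paper's formula.
    """
    rank = node_rank(graph)
    score = {u: rank[u] * spam_impact.get(u, 0.0) for u in rank}
    candidates = [u for u in score if score[u] > 0]
    return sorted(candidates, key=lambda u: score[u], reverse=True)[:k]


# Toy graph: an edge u -> v means "u follows v".
graph = {"a": ["s1"], "b": ["s1"], "c": ["s1", "s2"], "s1": [], "s2": ["s1"]}
spam_impact = {"s1": 0.9, "s2": 0.4}  # made-up values for illustration

top = influential_spammers(graph, spam_impact, k=1)
print(top)
```

In this toy graph, `s1` is followed by every other user and has the higher spam impact, so it is the first candidate for blocking. Users with no spam impact are never selected, however well connected they are.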

Full Text