Bot Detection on Social Networks Using Persistent Homology

Minh Nguyen, Mehmet Aktas, Esra Akbas

doi:10.3390/mca25030058

Abstract

The growth of social media in recent years has contributed to an ever-increasing network of user data in every aspect of life. This volume of generated data is becoming a vital asset for the growth of companies and organizations, serving as a powerful tool to gain insights and make crucial decisions. However, data is not always reliable, since it can be manipulated and disseminated by unreliable sources. In the field of social network analysis, this problem can be tackled by implementing machine learning models that learn to distinguish humans from bots, which are mostly harmful computer programs exploited to shape public opinion and circulate false information on social media. In this paper, we propose a novel topological feature extraction method for bot detection on social networks. We first create a weighted ego network for each user. We then encode the higher-order topological features of these ego networks using persistent homology. Finally, we use the extracted features to train a machine learning model and use that model to classify users as bots or humans. Our experimental results suggest that using the higher-order topological features coming from persistent homology is promising in bot detection and more effective than using classical graph-theoretic structural features.

Highlights

Online social networks have been an excellent platform for exchanging ideas and sharing information
Users’ activities and profiles on social media platforms, such as tweet content, tweeting behavior, and account properties like external URL ratio, are considered behavioral features, while social network topology measures, such as degree, edge count, average betweenness centrality, and brokerage, are considered structural features
We propose a novel method for bot detection in online social networks

Summary

Introduction

Online social networks have been an excellent platform for exchanging ideas and sharing information. Excessive utilization of social media may, however, give rise to various types of illegal activities, such as spam, fake news, and rumor spreading, produced by abnormal users. Most of these activities follow automated behavior patterns generated by automated websites or apps, which are called bots. Two types of features are used for bot detection on social networks in the literature: structural features and behavioral features. Graphs are structured data representing relationships between objects [10,11]. They are formed by a set of vertices (also called nodes) and a set of edges that are connections between pairs of vertices.
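To make the graph vocabulary above concrete, the following minimal Python sketch stores a small undirected graph as vertices and edges, extracts the one-hop ego network of a user, and computes two classical structural features (degree and edge count). The toy edge list and the helper names `build_graph` and `ego_network` are hypothetical and introduced only for illustration. The sketch is unweighted, whereas the paper uses weighted ego networks, so it should not be read as the authors' implementation.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) vertex pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def ego_network(adj, ego):
    """Return the vertex set and edge set of the one-hop ego network of `ego`.

    The ego network contains the ego, its direct neighbors, and every edge
    whose two endpoints both lie in that vertex set.
    """
    nodes = {ego} | set(adj[ego])
    edges = {
        frozenset((u, v))  # frozenset makes (u, v) and (v, u) the same edge
        for u in nodes
        for v in adj[u]
        if v in nodes
    }
    return nodes, edges


# Hypothetical toy network: vertices are users, edges are interactions.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
adj = build_graph(toy_edges)

nodes, ego_edges = ego_network(adj, "c")
degree = len(adj["c"])        # number of direct neighbors of the ego
edge_count = len(ego_edges)   # number of edges inside the ego network

print(sorted(nodes), degree, edge_count)
```

Here user `c` has three neighbors, and the edge between `d` and `e` is excluded because `e` lies outside the one-hop neighborhood. Features such as degree and edge count are the kind of classical structural features that the proposed topological features are compared against.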

Methods

Results

Conclusion