Abstract
Online Social Networks (OSNs) are fast becoming an essential medium for social interaction among their users. With the rapid growth of OSNs, malicious and illegal activities are also on the rise, posing threats such as disruption of communication, manipulation of the decision-making of gullible users, and unauthorized control of resources. Sybil accounts pose such threats in OSNs as well as in wireless ad-hoc networks. In this work, we use Machine Learning (ML) to identify Sybil accounts on Twitter. ML builds models that learn from existing datasets and can then be applied to real-time or future problems. Supervised ML techniques train a model on labelled data so that it can predict discrete values (class labels) for new inputs. In this paper, a classification model is trained using two classifiers, namely Random Forest (RF) and Support Vector Machine (SVM). To make the classification model more effective, two Feature Selection (FS) techniques are also used: Univariate selection and Correlation Matrix with Heatmap. The ML model then uses the selected features to identify Sybil accounts. This study also explores the effect of biasing the data of real accounts relative to that of illegitimate Sybil accounts during classification and feature selection. The results show that RF outperforms SVM by a shade in prediction accuracy under the given experimental setup.
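The abstract does not say how the correlation matrix is used to choose features, so the following is only a minimal sketch of one common approach, not the authors' procedure. It ranks features by the absolute value of their Pearson correlation with the label and then skips any feature that is nearly collinear with one already kept. The feature names, the toy values, the `select_features` helper, and the 0.9 redundancy threshold are all hypothetical and chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_features(columns, labels, redundancy_threshold=0.9):
    """Keep label-relevant features, skipping near-duplicates of kept ones.

    columns: dict mapping feature name -> list of values (one per account).
    labels:  list of class labels (here 1 = Sybil, 0 = real).
    The threshold is an assumed value, not one taken from the paper.
    """
    # Rank features from most to least correlated with the label.
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], labels)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        # A feature is redundant if it is almost collinear with a kept one.
        redundant = any(
            abs(pearson(columns[name], columns[other])) >= redundancy_threshold
            for other in kept
        )
        if not redundant:
            kept.append(name)
    return kept


# Toy data for eight hypothetical accounts (values are made up).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
accounts = {
    "tweets_per_day": [9, 8, 9, 8, 1, 2, 1, 2],
    # Exactly twice the column above, so it adds no new information.
    "tweets_per_two_days": [18, 16, 18, 16, 2, 4, 2, 4],
    "followers_to_friends": [6, 2, 6, 2, 3, 1, 3, 1],
}

kept = select_features(accounts, labels)
print(kept)
```

In a full pipeline of the kind the abstract describes, the retained columns would then be passed to the RF and SVM classifiers. A univariate selector would instead score each feature against the label on its own, without the redundancy check.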