Abstract

Identity resolution can reduce both information leakage and identity hacking to some extent. In this paper, we develop a prototype that classifies Twitter users as suspicious or nonsuspicious on the basis of features, collected through the Twitter APIs, that describe user demographics and tweeting activity. We devise a model based on user and tweet metadata that calculates a user score and a tweet score, then aggregates these scores to label the users in a collected dataset of 21,492 Twitter users as suspicious or nonsuspicious. A support vector machine classifier is then used to classify the labeled data. We also report our analysis of the role of the features and of the characteristics of the dataset used to categorize Twitter users. The experimental results show that the proposed system identifies suspicious users with an accuracy of 94.1%.
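The scoring-and-labeling step described above can be illustrated with a minimal sketch. The abstract does not give the actual features, weights, or threshold, so everything named here (the `User` and `Tweet` fields, the weights inside `user_score` and `tweet_score`, and the `w_user` and `threshold` parameters) is a hypothetical assumption chosen only to show the shape of the computation, not the authors' model. The subsequent support vector machine step is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class User:
    # Hypothetical user metadata fields (assumptions, not from the paper).
    followers: int
    following: int
    account_age_days: int
    has_default_profile_image: bool


@dataclass
class Tweet:
    # Hypothetical tweet metadata fields (assumptions, not from the paper).
    num_urls: int
    num_hashtags: int
    num_mentions: int


def user_score(u: User) -> float:
    """Illustrative user score in [0, 1]; higher means more suspicious."""
    ratio = u.following / (u.followers + 1)  # +1 avoids division by zero
    score = 0.4 * min(ratio / 10.0, 1.0)
    score += 0.3 * (1.0 if u.account_age_days < 30 else 0.0)
    score += 0.3 * (1.0 if u.has_default_profile_image else 0.0)
    return score


def tweet_score(tweets: List[Tweet]) -> float:
    """Illustrative tweet score in [0, 1], averaged over a user's tweets."""
    if not tweets:
        return 0.0
    per_tweet = [
        min((t.num_urls + t.num_hashtags + t.num_mentions) / 6.0, 1.0)
        for t in tweets
    ]
    return sum(per_tweet) / len(per_tweet)


def label(u: User, tweets: List[Tweet],
          w_user: float = 0.5, threshold: float = 0.5) -> str:
    """Aggregate both scores (assumed weighted sum) and apply a threshold."""
    combined = w_user * user_score(u) + (1.0 - w_user) * tweet_score(tweets)
    return "suspicious" if combined >= threshold else "nonsuspicious"


if __name__ == "__main__":
    new_account = User(followers=2, following=900, account_age_days=5,
                       has_default_profile_image=True)
    print(label(new_account, [Tweet(num_urls=2, num_hashtags=3, num_mentions=2)]))
```

In a pipeline like the one the abstract outlines, labels produced this way would serve as training targets for the classifier; the weighted sum and fixed threshold are just one plausible way to aggregate two scores.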
