Abstract

Many people who discuss sensitive or private issues on social media services use pseudonyms or aliases to avoid revealing their true identity, while using their usual, non-private accounts when posting on less sensitive issues. Previous research has shown that if those individuals post large amounts of user-generated content, stylometric techniques can be used to identify the author based on the characteristics of the textual content. In this article we show how an author’s identity can be unmasked in a similar way using various time features (e.g., the period of the day and the day of the week when a user’s posts were published). We combine several different time features into a timeprint, which can be seen as a type of fingerprint for identifying users on social media. We use supervised machine learning (i.e., author identification) and unsupervised alias matching (similarity detection) in a number of experiments with forum data to understand to what extent timeprints can be used to identify users in social media, both in isolation and when combined with stylometric features. The results show that timeprints can indeed be a very powerful tool for both author identification and alias matching in social media.
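To make the idea of a timeprint concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a timeprint built from only the two features named above (hour of day and day of week), represented as normalized histograms, and it uses cosine similarity as a stand-in for the similarity detection step. The function names `timeprint` and `cosine_similarity` and the example posting times are hypothetical.

```python
from datetime import datetime
from math import sqrt


def timeprint(timestamps):
    """Build an illustrative 31-dimensional timeprint.

    The first 24 entries are the fraction of posts per hour of day,
    the last 7 the fraction of posts per day of week (Monday = 0).
    This feature choice is an assumption made for the sketch.
    """
    hours = [0.0] * 24
    days = [0.0] * 7
    for ts in timestamps:
        hours[ts.hour] += 1
        days[ts.weekday()] += 1
    n = len(timestamps)
    if n == 0:
        return hours + days
    return [h / n for h in hours] + [d / n for d in days]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical posting times for three aliases.
alias_a = [datetime(2024, 1, 1, 21), datetime(2024, 1, 2, 22), datetime(2024, 1, 3, 21)]
alias_b = [datetime(2024, 1, 8, 21), datetime(2024, 1, 9, 22), datetime(2024, 1, 11, 23)]
alias_c = [datetime(2024, 1, 6, 7), datetime(2024, 1, 7, 8), datetime(2024, 1, 13, 7)]

sim_ab = cosine_similarity(timeprint(alias_a), timeprint(alias_b))
sim_ac = cosine_similarity(timeprint(alias_a), timeprint(alias_c))
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

In this toy data, aliases a and b both post on weekday evenings, so their timeprints overlap, while alias c posts on weekend mornings and shares no bins with a. A realistic setup would need far more posts per alias and, as the abstract notes, could combine such time features with stylometric ones.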

Highlights

  • An increasing share of many people’s lives is spent online

  • While much of this information is expressed publicly, web forums and other social media services also contain more sensitive information that could harm the author if it became widely known who the physical person behind the posting account really is

  • Conclusions and future work: In this article, we have presented the idea that a user’s timeprint can be useful for identifying users who make use of multiple aliases


Summary

Introduction

An increasing share of many people’s lives is spent online. People use the Internet and social media to communicate, express their opinions and beliefs, discuss topics of interest to them, etc. There are many examples from the Security Informatics research community related to the analysis of terrorist activities on the Web (see e.g., [1,2,3]), such as the spreading of extremist propaganda and discussions on how to make improvised explosive devices. In such settings, it can be of fundamental importance to intelligence analysts to find out what a person writes and who the physical person behind some pieces of text really is. Similarly, it can be highly relevant for the police to find out who the author behind anonymous posts in a cybercrime investigation really is by linking the anonymized social media account to non-anonymized postings or accounts. At the same time, many people would like to be able to freely express their ideas and beliefs while avoiding revealing their true identity to, e.g., friends, employers, commercial companies, police, or intelligence services.

