Abstract

In recent years, Online Social Networks (OSNs) have become an integral part of our daily lives. There are hundreds of OSNs, each with its own focus and its own set of services and functionalities. To take advantage of the full range of these services, users often create accounts on several OSNs using the same or different personal information. Retrieving all available data about an individual from several OSNs and merging it into one profile can be useful for many purposes. In this paper, we present a method for solving the Entity Resolution (ER) problem of matching user profiles across multiple OSNs. Our algorithm matches two user profiles from two different OSNs using machine learning techniques applied to features extracted from each of the profiles. Using supervised learning and the extracted features, we constructed several classifiers, which were trained and then used to rank the probability that two user profiles from two different OSNs belong to the same individual. These classifiers used 27 features of three main types: name-based features (e.g., the Soundex values of two names), general user-information-based features (e.g., the cosine similarity between two user profiles), and social network topology-based features (e.g., the number of mutual friends in two users' friend lists). This experimental study uses real-life data collected from two popular OSNs, Facebook and Xing. The proposed algorithm was evaluated, and its classification performance, measured by AUC, was 0.982 in identifying user profiles across the two OSNs.
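
To make the three feature types concrete, the following is a minimal sketch of one feature from each family for a candidate pair of profiles. It is an illustration under stated assumptions, not the implementation used in the study: the profile fields (`name`, `about`, `friends`), the function names, the bag-of-words tokenization, and the matching of friends by normalized name are all hypothetical choices, and the abstract does not specify how the 27 features were actually computed.

```python
import math
from collections import Counter

# Standard American Soundex consonant codes; all other letters carry no code.
_SOUNDEX_CODES = {
    ch: digit
    for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6"))
    for ch in group
}

def soundex(name):
    """Return the 4-character Soundex code of a name ('' if it has no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (result + "000")[:4]

def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts as word-count vectors (assumed tokenization)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def mutual_friend_count(friends_a, friends_b):
    """Count friends appearing in both lists, matched by normalized name (assumption)."""
    normalize = lambda names: {n.strip().lower() for n in names}
    return len(normalize(friends_a) & normalize(friends_b))

def pair_features(profile_a, profile_b):
    """Build a small feature dict for one candidate pair (hypothetical field names)."""
    return {
        "soundex_match": int(soundex(profile_a["name"]) == soundex(profile_b["name"])),
        "about_cosine": cosine_similarity(profile_a["about"], profile_b["about"]),
        "mutual_friends": mutual_friend_count(profile_a["friends"], profile_b["friends"]),
    }

# Example pair: one profile from each network (made-up data).
example_a = {"name": "Jon Smith", "about": "software engineer tel aviv",
             "friends": ["Ann Lee", "Bob Ray"]}
example_b = {"name": "John Smith", "about": "engineer tel aviv",
             "friends": ["ann lee", "Cy Doe"]}
example_features = pair_features(example_a, example_b)
```

A feature dictionary like `example_features` would be computed for every candidate pair and passed to a supervised classifier, whose predicted probability is then used to rank the pairs.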
