Abstract
People today tend to reveal different aspects of their lives on different websites. Given a user's account information on one site, profile matching can find his or her accounts on other websites, which benefits application domains including recommendation, privacy, and security. Traditional profile matching methods use as many attributes as possible and calculate the similarity of attribute values one by one, which does not suit realistic scenarios with large-scale data and few common attributes. In this paper, we therefore propose a new profile matching method. We extracted information about user identities on about.me, Twitter, and GitHub, and used real name, username, location, and URL to generate each user's features. We applied several similarity measures (e.g., Jaro-Winkler similarity, Levenshtein distance, and N-gram distance) to measure the similarity of user profiles on different websites. Using the most promising similarity measure and parameters, we achieved high reliability, with a recall over 95% at 98% precision, comparable to the basic method, while greatly reducing running time.
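To make the measures named above concrete, the following is a minimal Python sketch, not the authors' implementation. It computes Levenshtein distance, Jaro-Winkler similarity, and a character n-gram similarity, then combines per-field scores over the four attributes. The Jaccard formulation of the n-gram measure, the field weights, the dictionary keys, and the skipping of missing fields are all illustrative assumptions; the abstract does not specify them.

```python
def levenshtein(s: str, t: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def jaro(s: str, t: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit, t_hit = [False] * len(s), [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_seq = [c for c, hit in zip(s, s_hit) if hit]
    t_seq = [c for c, hit in zip(t, t_hit) if hit]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s: str, t: str, p: float = 0.1) -> float:
    """Jaro similarity boosted by a shared prefix of up to 4 characters."""
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def ngram_similarity(s: str, t: str, n: int = 2) -> float:
    """Jaccard overlap of character n-gram sets (one assumed n-gram measure)."""
    def grams(x: str) -> set:
        return {x[i:i + n] for i in range(len(x) - n + 1)} or ({x} if x else set())
    a, b = grams(s), grams(t)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical field weights; the abstract gives no values.
WEIGHTS = {"real_name": 0.35, "username": 0.35, "location": 0.15, "url": 0.15}


def profile_similarity(p1: dict, p2: dict, measure=jaro_winkler) -> float:
    """Weighted mean of per-field similarity over fields present in both profiles."""
    total = weight_sum = 0.0
    for field, w in WEIGHTS.items():
        a, b = p1.get(field), p2.get(field)
        if not a or not b:
            continue  # assumption: a missing field is skipped, not penalized
        total += w * measure(a.strip().lower(), b.strip().lower())
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


if __name__ == "__main__":
    # Made-up profiles for illustration only.
    a = {"real_name": "Jane Doe", "username": "janedoe", "location": "Berlin"}
    b = {"real_name": "Jane M. Doe", "username": "jane_doe", "url": "example.org"}
    print(profile_similarity(a, b))
    print(profile_similarity(a, b, measure=ngram_similarity))
```

In a matching pipeline, a pair of profiles would be declared the same user when the score exceeds a threshold tuned on labeled pairs; that threshold is where the trade-off between precision and recall is set.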