Abstract
Anchor user identification across social networks is a classification task that determines whether a pair of accounts from different social networks belongs to the same user. It is fundamental to research on information dissemination across social networks. Based on the observation that users prefer similar or identical display names on different social networks, some researchers have built models on the similarity between display names. However, because of the settings of Chinese social networks and the pronunciation and font characteristics of Chinese display names, these methods do not perform well on Chinese social network datasets. To address this problem, we analyze the display name pairs of Chinese anchor users, which we obtained with a crawler built for this paper. We then define four special features that capture pronunciation and font similarities. Finally, we use Gradient Boosting to build the identification model. Experiments on the ground-truth datasets we obtained show that these features improve the performance of display name-based anchor user identification between Chinese social networks.
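To make the idea of a pronunciation feature concrete, the following is a minimal sketch and not the four features defined in the paper, which the abstract does not specify. It assumes a tiny hand-written pinyin table (`TOY_PINYIN`, hypothetical) and uses the standard-library `difflib.SequenceMatcher` as a stand-in similarity measure. The helper names `to_pinyin`, `string_similarity`, and `name_pair_features` are illustrative only. Font similarity is not covered here.

```python
from difflib import SequenceMatcher
from typing import List

# Hypothetical toy pinyin table, an assumption for illustration only.
# A real system would need a full pronunciation dictionary.
TOY_PINYIN = {
    "小": "xiao",
    "晓": "xiao",
    "明": "ming",
    "名": "ming",
    "王": "wang",
}


def to_pinyin(name: str) -> str:
    """Map each character to its toy pinyin; unknown characters are kept as-is."""
    return " ".join(TOY_PINYIN.get(ch, ch) for ch in name)


def string_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] computed by difflib's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def name_pair_features(name_a: str, name_b: str) -> List[float]:
    """Return [character-level similarity, pronunciation-level similarity]."""
    return [
        string_similarity(name_a, name_b),
        string_similarity(to_pinyin(name_a), to_pinyin(name_b)),
    ]


# Two display names written with different characters but pronounced alike:
# the character-level score is low while the pronunciation-level score is high.
features = name_pair_features("小明", "晓名")
```

Feature vectors of this kind, computed for labeled account pairs, could then be passed to a gradient boosting classifier. That step is omitted here because it depends on the library and the training data used.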