Abstract

User identification across multiple online social networks is beneficial for building knowledge graphs. Because of privacy protection considerations, researchers have shown increasing interest in user identification based on username similarity. However, existing solutions rely on features hand-crafted by domain experts and do not exploit the deep semantic features of usernames. Moreover, existing solutions are limited to usernames in a single language, such as English or Chinese, and ignore multilingual usernames. This paper proposes a username similarity method, based on a multilingual pre-trained model, for user identification across multiple online social networks. First, we use many multilingual corpora to enable the model to learn more semantic information and extract deep semantic features of usernames. Then, the model is fine-tuned on our constructed dataset of multilingual usernames across multiple online social networks. Finally, the fine-tuned model is used to assess the similarity of user identities across multiple online social networks. Our method facilitates user identification with limited data. The efficiency of our model is verified on three constructed real-world multilingual username datasets across multiple online social networks and compared with existing state-of-the-art methods. Experimental results show that the proposed algorithm outperforms the compared algorithms.
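
To make the idea of username similarity concrete, the following is a minimal sketch of the final matching step only. It is not the method proposed in this paper: the `char_ngrams` function is a hypothetical stand-in that represents a username as a bag of character bigrams, whereas the paper obtains its features from a fine-tuned multilingual pre-trained model. The function names and the `threshold` value of 0.5 are assumptions chosen for illustration.

```python
import math
import unicodedata
from collections import Counter


def char_ngrams(username: str, n: int = 2) -> Counter:
    """Hypothetical stand-in encoder: bag of character n-grams.

    The paper uses a multilingual pre-trained model here instead.
    """
    # NFKC-normalise and case-fold so equivalent Unicode forms compare equal.
    text = unicodedata.normalize("NFKC", username).casefold()
    # Mark the start and end so prefixes and suffixes carry weight.
    padded = f"^{text}$"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_user(u1: str, u2: str, threshold: float = 0.5) -> bool:
    """Guess whether two usernames belong to one person (assumed threshold)."""
    return cosine(char_ngrams(u1), char_ngrams(u2)) >= threshold
```

For example, `same_user("alice_w", "alice.w")` compares the two bigram vectors and returns a boolean decision. A surface-level encoder like this cannot relate usernames written in different scripts, which is the gap a multilingual semantic encoder is meant to address.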
