A cross-domain transfer learning model for author name disambiguation on heterogeneous graph with pretrained language model

Zhenyuan Huang, Hui Zhang, Chengqian Hao, Haijun Yang, Harris Wu

doi:10.1016/j.knosys.2024.112624

Journal: Knowledge-Based Systems

Publication Date: Oct 18, 2024

Abstract

Author names in scientific literature are often ambiguous, complicating the accurate retrieval of academic information. Furthermore, many author names are shared by multiple scholars, making it challenging to construct academic search engine knowledge bases. These issues highlight the need for effective author name disambiguation. Existing methods have limitations in handling text content and heterogeneous graph node representations and often require extensive annotated training data. This study introduces an academic heterogeneous graph embedding neural network, HGNN-S, which leverages a pretrained semantic language model to integrate semantic information from texts, heterogeneous attribute relationships, and heterogeneous neighbor data. Trained on a small amount of single-domain annotated data, HGNN-S can disambiguate names across multiple domains. Experimental results demonstrate that our model outperforms current state-of-the-art methods and enhances search performance on the China National Platform, Kejso.
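
As a rough illustration of the general idea the abstract describes (text semantics combined with heterogeneous neighbor information, then grouping papers by author identity), the toy sketch below may help. It is not the HGNN-S architecture: the bag-of-words vector is only a stand-in for a pretrained language model embedding, the single round of neighbor summation is a stand-in for a graph neural network layer, and the paper records, the `threshold` value, and all function names are hypothetical.

```python
import math
from collections import Counter

# Made-up papers that all list the same ambiguous author name.
papers = {
    "p1": {"title": "graph neural network for citation analysis",
           "coauthors": {"Z. Huang"}, "venue": "KBS"},
    "p2": {"title": "heterogeneous graph embedding for citation networks",
           "coauthors": {"Z. Huang", "H. Wu"}, "venue": "KBS"},
    "p3": {"title": "protein folding with molecular dynamics simulation",
           "coauthors": {"L. Chen"}, "venue": "BioJ"},
    "p4": {"title": "molecular simulation of protein binding",
           "coauthors": {"L. Chen"}, "venue": "BioJ"},
}

def text_vector(title):
    # Assumption: bag-of-words counts stand in for a pretrained LM embedding.
    return Counter(title.split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def neighbours(pid):
    # Papers reachable through a shared coauthor node or a shared venue node.
    me = papers[pid]
    return [q for q, other in papers.items()
            if q != pid and (me["coauthors"] & other["coauthors"]
                             or me["venue"] == other["venue"])]

def embed(pid):
    # One round of aggregation: own text vector plus neighbours' text vectors.
    vec = Counter(text_vector(papers[pid]["title"]))
    for q in neighbours(pid):
        vec.update(text_vector(papers[q]["title"]))
    return vec

def disambiguate(threshold=0.5):
    # Group papers whose embeddings are similar; each group is one inferred author.
    vecs = {pid: embed(pid) for pid in papers}
    clusters = []
    for pid in papers:
        linked = [c for c in clusters
                  if any(cosine(vecs[pid], vecs[q]) >= threshold for q in c)]
        merged = {pid}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return sorted(sorted(c) for c in clusters)

print(disambiguate())
```

On this toy data the two graph-learning papers and the two protein-simulation papers fall into separate groups, which is the kind of split a disambiguation model is expected to produce for one shared name.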
