Abstract

In the real world, it is inevitable that some people share a name. This ambiguity in author names makes academic works difficult to retrieve. Existing author name disambiguation methods generally rely on feature engineering or on the graph topology of academic networks (e.g., collaboration relationships). However, the required features may be costly to obtain because of data availability or privacy constraints. Moreover, simple relational data cannot capture the rich semantics underlying heterogeneous academic graphs. Therefore, in this paper, we study the problem of author name disambiguation in the setting of heterogeneous information networks and propose a novel author name disambiguation method based on network representation learning. First, we extract the heterogeneous information networks and meta-path channels based on the selected meta-paths. Second, we propose two meta-path based proximities to measure the neighboring and structural similarities between nodes. Third, the embeddings of the various node types are sampled and jointly updated according to the extracted meta-path channels. Finally, the disambiguation task is completed by applying an effective clustering method to the generated paper-related vector space. Experimental results on the well-known AMiner dataset show that the proposed method obtains better results than state-of-the-art author name disambiguation methods.

