Abstract

This paper presents a method for improving the data integrity of an individual-based bibliographic repository. Integrity is improved by comparing each individual's raw publication data with the clustered publication data for that individual. Hierarchical agglomerative clustering is used to cluster publication records with similar author names. Clustering proceeds in two steps: the first is based on co-author relationships, and the second on title similarity and the difference in publication year. The two-step hierarchical clustering technique for name disambiguation has been applied to the Universitas Sriwijaya Publication Data Center with good accuracy.
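
The two-step clustering described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the linkage criterion, the title similarity measure, or any thresholds, so the sketch assumes single-linkage merging, `difflib.SequenceMatcher` for title similarity, and illustrative cut-offs (`min_title_sim=0.6`, `max_year_gap=3`). The names `Pub`, `agglomerate`, and `disambiguate` are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Pub:
    """One publication record filed under an ambiguous author name (hypothetical schema)."""
    title: str
    year: int
    coauthors: frozenset = frozenset()


def agglomerate(clusters, similar):
    """Single-linkage agglomerative merging (an assumption, not from the paper).

    Repeatedly merges two clusters whenever any cross-cluster pair of
    records satisfies similar(a, b), until no further merge is possible.
    """
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if any(similar(a, b) for a in clusters[i] for b in clusters[j]):
                clusters[i].extend(clusters.pop(j))
                merged = True
                break  # indices are stale after a merge; rescan
    return clusters


def share_coauthor(a, b):
    """Step 1 criterion: the two records have at least one co-author in common."""
    return bool(a.coauthors & b.coauthors)


def title_similarity(a, b):
    """Character-level similarity in [0, 1]; the measure is an assumption."""
    return SequenceMatcher(None, a.title.lower(), b.title.lower()).ratio()


def disambiguate(pubs, min_title_sim=0.6, max_year_gap=3):
    """Two-step clustering: co-authors first, then title similarity and year gap."""
    step1 = agglomerate([[p] for p in pubs], share_coauthor)

    def close_title_and_year(a, b):
        return (abs(a.year - b.year) <= max_year_gap
                and title_similarity(a, b) >= min_title_sim)

    return agglomerate(step1, close_title_and_year)
```

Under these assumptions, each resulting cluster is a candidate set of records belonging to one real author. Records that end up outside the cluster matching an individual's profile could then be compared against that individual's raw publication list and flagged for review.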
