Abstract

This paper proposes two novel approaches to measuring the similarity of co-cited authors for the task of document clustering: (a) paragraph-level content-based author co-citation analysis (PCACA) and (b) section-level content-based author co-citation analysis (SCACA). The two approaches mine the textual cited references within a given scientific publication at the paragraph level and the section level, respectively. Using over 2000 full-text publications in the field of Computer and Information Sciences, indexed in PLOS.org, we extract useful information from each full-text publication, such as citing sentences, the location of citing sentences, the cited first author's name, and the title of the cited reference. We show that our proposed SCACA method outperforms existing clustering methods, producing better clusters with a minimum graph density of 0.327 and an average degree of 27.143. Finally, we show that SCACA produces the optimum number of clusters, which comprehensively explains the sub-disciplines of co-cited author pairs.
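To make the paragraph-level versus section-level distinction concrete, the following is a minimal sketch, not the authors' implementation. It assumes each extracted citation is a record with hypothetical `section`, `paragraph`, and `author` fields, treats two first authors as co-cited when they appear in the same paragraph (PCACA-style) or the same section (SCACA-style), and computes graph density and average degree with the standard formulas for an undirected simple graph. The paper's exact similarity weighting and graph construction may differ.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citations, level):
    """Count co-cited author pairs at 'paragraph' or 'section' level.

    `citations` is a list of dicts with the (assumed) keys
    'section', 'paragraph', and 'author'.
    """
    groups = {}
    for c in citations:
        if level == "section":
            key = (c["section"],)
        else:
            key = (c["section"], c["paragraph"])
        groups.setdefault(key, set()).add(c["author"])

    counts = Counter()
    for authors in groups.values():
        # Each unordered pair of distinct authors in a unit is co-cited once.
        for a, b in combinations(sorted(authors), 2):
            counts[(a, b)] += 1
    return counts


def graph_stats(counts):
    """Return (density, average degree) of the undirected co-citation graph."""
    nodes = {author for pair in counts for author in pair}
    n, m = len(nodes), len(counts)
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    avg_degree = 2 * m / n if n else 0.0
    return density, avg_degree


# Toy data, invented purely for illustration.
citations = [
    {"section": "Intro", "paragraph": 1, "author": "Smith"},
    {"section": "Intro", "paragraph": 1, "author": "Lee"},
    {"section": "Intro", "paragraph": 2, "author": "Khan"},
    {"section": "Methods", "paragraph": 1, "author": "Lee"},
    {"section": "Methods", "paragraph": 1, "author": "Khan"},
]

paragraph_pairs = cocitation_counts(citations, "paragraph")
section_pairs = cocitation_counts(citations, "section")
```

On this toy input the section-level grouping links more author pairs than the paragraph-level grouping, because a section pools the citations of all of its paragraphs.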
