Abstract

Citations are very important parameters and are used to take many important decisions like ranking of researchers, institutions, countries, and to measure the relationship between research papers. All of these require accurate counting of citations and their occurrence (in-text citation counts) within the citing papers. Citation anchors refer to the citation made within the full text of the citing paper for example: ‘[1]’, ‘(Afzal et al, 2015)’, ‘[Afzal, 2015]’ etc. Identification of citation-anchors from the plain-text is a very challenging task due to the various styles and formats of citations. Recently, Shahid et al. highlighted some of the problems such as commonality in content, wrong allotment, mathematical ambiguities, and string variations etc in automatically identifying the in-text citation frequencies. The paper proposes an algorithm, CAD, for identification of citation-anchors and its in-text citation frequency based on different rules. For a comprehensive analysis, the dataset of research papers is prepared: on both Journal of Universal Computer Science (J.UCS) and (2) CiteSeer digital libraries. In experimental study, we conducted two experiments. In the first experiment, the proposed approach is compared with state-of-the-art technique over both datasets. The J.UCS dataset consists of 1200 research papers with 16,000 citation strings or references while the CiteSeer dataset consists of 52 research papers with 1850 references. The total dataset size becomes 1252 citing documents and 17,850 references. The experiments showed that CAD algorithm improved F-score by 44% and 37% respectively on both J.UCS and CiteSeer dataset over the contemporary technique (Shahid et al. in Int J Arab Inf Technol 12:481–488, 2014). The average score is 41% on both datasets. In the second experiment, the proposed approach is further analyzed against the existing state-of-the-art tools: CERMINE and GROBID. According to our results, the proposed approach is best performing with F1 of 0.99, followed by GROBID (F1 0.89) and CERMINE (F1 0.82).

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.