Abstract

This paper analyzes the concordance between bibliometrics and peer review. It draws evidence from the data of two experiments conducted by the Italian governmental agency for research evaluation. The agency performed the experiments to validate the adoption, in the Italian research assessment exercises, of a dual system of evaluation, in which some outputs were evaluated by bibliometrics and others by peer review. The two experiments were based on stratified random samples of journal articles. Each article was scored by bibliometrics and by peer review, and the degree of concordance between the two evaluations was then computed. The correct setting of the experiments is defined by developing the design-based estimation of Cohen's kappa coefficient and testing procedures for assessing the homogeneity of missing proportions between strata. The results of both experiments show that, for each research area of science, technology, engineering and mathematics, the degree of agreement between bibliometrics and peer review is, at most, weak at the level of the individual article. Thus, the outcome of the experiments does not validate the use of the dual system of evaluation in the Italian research assessments. More generally, the very weak concordance indicates that metrics should not replace peer review at the level of the individual article. Hence, the use of the dual system in a research assessment might worsen the quality of information compared to the adoption of peer review only or bibliometrics only.
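For reference, the concordance analysis rests on Cohen's kappa; in its standard (unweighted) form, computed from the cross-classification of the two evaluations, it is

\[
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad
p_o = \sum_{i} p_{ii}, \qquad
p_e = \sum_{i} p_{i\cdot}\, p_{\cdot i},
\]

where \(p_{ij}\) is the proportion of articles placed in category \(i\) by bibliometrics and in category \(j\) by peer review, so that \(p_o\) is the observed agreement and \(p_e\) the agreement expected by chance. The design-based estimator developed in the paper replaces these proportions with estimates that account for the stratified sampling; its exact form is not reproduced in this summary.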

Highlights

  • Efficient implementation of a research assessment exercise is a common challenge for policy makers

  • Two models have prevailed [1]: a first model based on peer review, such as the British Research Excellence Framework (REF), and a second model based on bibliometric indicators, such as the Excellence in Research for Australia (ERA), for the years preceding 2018

  • The Italian governmental agency for research evaluation ANVUR conducted two experiments for assessing the degree of agreement between bibliometrics and peer review. They were based on stratified random samples of articles, which were classified by bibliometrics and by informed peer review

Introduction

Efficient implementation of a research assessment exercise is a common challenge for policy makers. Even if attention is limited to scientific quality or scientific impact, there is a trade-off between the quality of information produced by a research assessment and its costs. Two models have prevailed [1]: a first model based on peer review, such as the British Research Excellence Framework (REF), and a second model based on bibliometric indicators, such as the Excellence in Research for Australia (ERA), for the years preceding 2018. The first model is considered more costly than the second. In the discussion on the pros and cons of the two models, a central topic is the agreement between bibliometrics and peer review.
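As an illustration of how such agreement can be quantified from data of the kind collected in the ANVUR experiments, the sketch below computes a plug-in, stratum-weighted estimate of Cohen's kappa under stratified simple random sampling. The merit categories, stratum labels, and population shares are hypothetical, and the code is only a minimal approximation written for this summary, not the exact design-based estimator developed in the paper:

    import numpy as np

    def design_based_kappa(scores_bib, scores_rev, strata, weights):
        """Stratum-weighted plug-in estimate of Cohen's kappa.

        scores_bib, scores_rev -- category labels assigned to each sampled article
                                  by bibliometrics and by peer review
        strata                 -- stratum identifier of each sampled article
        weights                -- dict mapping stratum id to population share N_h / N
        """
        scores_bib = np.asarray(scores_bib)
        scores_rev = np.asarray(scores_rev)
        strata = np.asarray(strata)
        categories = np.union1d(scores_bib, scores_rev)
        idx = {c: i for i, c in enumerate(categories)}
        k = len(categories)

        # Estimate the joint category proportions as a weighted average of the
        # within-stratum sample proportions (plug-in estimator under stratified
        # simple random sampling).
        p = np.zeros((k, k))
        for stratum, share in weights.items():
            mask = strata == stratum
            n_h = mask.sum()
            if n_h == 0:
                continue
            table_h = np.zeros((k, k))
            for b, r in zip(scores_bib[mask], scores_rev[mask]):
                table_h[idx[b], idx[r]] += 1
            p += share * table_h / n_h

        p_obs = np.trace(p)                           # observed agreement
        p_exp = float(p.sum(axis=1) @ p.sum(axis=0))  # agreement expected by chance
        return (p_obs - p_exp) / (1 - p_exp)

    # Toy usage: two strata ("A", "B") with hypothetical population shares and
    # articles scored on a four-class merit scale coded 0-3.
    bib = [3, 2, 2, 1, 0, 3, 1, 2]
    rev = [3, 1, 2, 2, 0, 2, 1, 1]
    strata = ["A", "A", "A", "A", "B", "B", "B", "B"]
    print(design_based_kappa(bib, rev, strata, {"A": 0.6, "B": 0.4}))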
