Abstract

Purpose
The purpose of this paper is to present data that begin to detail the deficiencies of the log file analytics reporting methods commonly built into institutional repository (IR) platforms. The authors propose a new method for collecting and reporting IR item download metrics and introduce a web service prototype that captures activity that current analytics methods are likely to either miss or over-report.

Design/methodology/approach
Data were extracted from the DSpace Solr logs of an IR and cross-referenced with Google Analytics and Google Search Console data to directly compare the Citable Content Downloads recorded by each method.

Findings
This study provides evidence that log file analytics data appear to grossly over-report downloads, owing to traffic from robots that are difficult to identify and screen. The study also introduces a proof-of-concept prototype that makes the research method easily accessible to IR managers who seek accurate counts of Citable Content Downloads.

Research limitations/implications
The method described in this paper does not account for direct access to Citable Content Downloads that originates outside Google Search properties.

Originality/value
This paper proposes that IR managers adopt a new reporting framework that classifies IR page views and download activity into three categories that communicate metrics about user activity related to the research process. It also proposes that IR managers rely on a hybrid of existing Google services to improve reporting of Citable Content Downloads, and it offers a prototype web service where IR managers can test results for their repositories.
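To make the comparison concrete, the following is a minimal sketch of the kind of log extraction the methodology describes: counting bitstream download events in a DSpace Solr statistics core, with and without DSpace's built-in robot screening. The endpoint URL is hypothetical, and the query fields (type, statistics_type, isBot) assume the default DSpace usage-statistics schema; this is an illustration, not the paper's actual pipeline.

```python
import requests

# Hypothetical endpoint; DSpace records usage events in a "statistics" Solr core.
SOLR_URL = "http://localhost:8080/solr/statistics/select"

def count_downloads(exclude_bots=True):
    """Count bitstream download events in the DSpace Solr statistics core.

    In the default DSpace usage-statistics schema, type:0 marks bitstream
    (file) events, statistics_type:view marks view/download hits, and isBot
    is set by DSpace's built-in spider detection.
    """
    query = "type:0 AND statistics_type:view"
    if exclude_bots:
        query += " AND isBot:false"
    params = {"q": query, "rows": 0, "wt": "json"}
    resp = requests.get(SOLR_URL, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["response"]["numFound"]

if __name__ == "__main__":
    print("All recorded download events:", count_downloads(exclude_bots=False))
    print("Events after DSpace bot screening:", count_downloads(exclude_bots=True))
```

Comparing the two counts gives a first, coarse view of how much of the platform's recorded activity its own screening attributes to robots, which is the gap the study probes against Google Analytics and Google Search Console data.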

Highlights

  • Institutional repositories (IR) have been under development for over fifteen years and have collectively become a significant source of scholarly content

  • Using file download counts as a metric for scholarly value is crucial for IR assessment, but it is a surprisingly difficult metric to measure accurately due to the deficiencies of web analytics tools and due to overwhelming non-human traffic

  • The total IR activity from the four repositories that we can report with a high level of confidence and accuracy was calculated by combining Google Analytics Page Views, Google Search Console Clicks, and Google Analytics Events (a combination sketched below)
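As a rough illustration of that combination, the sketch below sums three hypothetical CSV exports, one per metric. The file names and the single "count" column are placeholders for whatever export format a given repository's reporting workflow produces, not the paper's actual data layout.

```python
import csv

# Hypothetical export files; each CSV is assumed to hold one numeric
# "count" value per row (e.g., a daily total).
SOURCES = {
    "ga_page_views": "ga_page_views.csv",    # Google Analytics Page Views
    "gsc_clicks": "gsc_clicks.csv",          # Google Search Console Clicks
    "ga_download_events": "ga_events.csv",   # Google Analytics Events (downloads)
}

def column_total(path, column="count"):
    """Sum one numeric column from a CSV export."""
    with open(path, newline="") as fh:
        return sum(int(row[column]) for row in csv.DictReader(fh))

def total_ir_activity():
    """Combine the three metrics into a single reportable activity figure."""
    per_source = {name: column_total(path) for name, path in SOURCES.items()}
    per_source["total"] = sum(per_source.values())
    return per_source

if __name__ == "__main__":
    for name, value in total_ir_activity().items():
        print(f"{name}: {value}")
```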



Introduction

Institutional repositories (IR) have been under development for over fifteen years and have collectively become a significant source of scholarly content. The value proposition that justifies the expense of building and maintaining open access IR is based largely on unrestricted access to their content, and on the ability of IR managers and library administrators to report impact to researchers and university administrators. Using file download counts as a metric for scholarly value is crucial for IR assessment, but it is a surprisingly difficult metric to measure accurately, owing to the deficiencies of web analytics tools and to overwhelming non-human (robot) traffic. The scholarly information-gathering process includes a filtering approach (Acharya, 2015) through which the researcher eventually arrives at citable scholarly content. Measurable human interaction with IR can be said to include page views or downloads in three categories: 1. Browse pages organized by author, title, community pages, statistics, etc.
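The difficulty of screening robot traffic can be seen in a deliberately naive user-agent filter like the one below. This is not the authors' method, only a sketch of why pattern-based screening under-identifies robots: the fragment list is illustrative and incomplete, and production screening would rely on a maintained list such as the COUNTER robots list.

```python
import re

# Illustrative, deliberately incomplete user-agent fragments.
BOT_PATTERNS = re.compile(
    r"bot|crawl|spider|slurp|curl|wget|python-requests", re.IGNORECASE
)

def is_robot(user_agent):
    """Treat blank user agents and pattern matches as robot traffic."""
    if not user_agent:
        return True  # blank user agents are usually automated clients
    return bool(BOT_PATTERNS.search(user_agent))

# Toy log records: (requested path, user agent).
hits = [
    ("/bitstream/1/234/article.pdf", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"),
    ("/bitstream/1/234/article.pdf", "Googlebot/2.1 (+http://www.google.com/bot.html)"),
    ("/bitstream/1/234/article.pdf", ""),
]

human_hits = [path for path, ua in hits if not is_robot(ua)]
print(f"Downloads counted after naive screening: {len(human_hits)}")
# Prints 1 here, but a robot presenting a browser-like user agent would slip
# through unnoticed, which is exactly how log-file counts come to over-report.
```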
