Abstract
Social scientists are producing an ever-expanding volume of data, which raises questions about how to appraise and select content when the resources for processing data for reuse are finite. We analyze users’ search activity in an established social science data repository to better understand demand for data and to guide collection development more effectively. By applying a data-driven approach, we aim to ensure that curation resources are applied to make the most valuable data findable, understandable, accessible, and usable. The data, drawn from a domain repository for the social sciences, cover more than 500,000 searches per year in 2014 and 2015. Using a newly created search-to-study ratio technique, we identified gaps in the repository’s holdings and used this analysis to inform our collection and curation practices and policies. The evaluative technique we propose in this paper will serve as a baseline for future studies of trends in user demand over time at this repository, with broader implications for other data repositories.
Highlights
Data repositories work to ensure data are sufficiently preserved, accessible, and understandable, now and in the future
We propose several different means of measuring user demand that leverage the web analytics that most digital repositories already capture to some degree
The evaluative technique that we set forth stems from the library and information literature on user behavior in online environments
Summary
Data repositories work to ensure data are sufficiently preserved, accessible, and understandable, now and in the future. A scientist’s substantive needs are influenced by her theoretical and conceptual framework, prior work and its gaps, and the ever-evolving dialogue in her discipline. These needs also reflect the broader sociopolitical environment, including current events, which pushes researchers to seek new data to address society’s pressing challenges. Web analytics is a type of user behavior data captured by examining the traces of information that come from human-computer interaction (Dumais et al., 2014). Despite the widespread use of Google Analytics among academic libraries to understand their patrons’ search needs, data archives have not typically used such information to define content demand.
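The abstract names a search-to-study ratio but this section does not define it. One plausible reading, offered here only as an illustrative sketch, is the number of searches for a term divided by the number of studies in the holdings indexed under that term, so that a high ratio flags demand that outstrips supply. The function name, the exact-match normalization, and the sample data below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def search_to_study_ratio(search_terms, study_subjects):
    """Hypothetical ratio: searches per term / studies indexed under that term.

    Assumes exact matching on lowercased, whitespace-trimmed terms.
    Terms with no matching studies get an infinite ratio (an unmet-demand flag).
    """
    searches = Counter(t.strip().lower() for t in search_terms)
    studies = Counter(s.strip().lower() for s in study_subjects)
    ratios = {}
    for term, n_searches in searches.items():
        n_studies = studies.get(term, 0)
        ratios[term] = n_searches / n_studies if n_studies else float("inf")
    return ratios


# Made-up example data: one entry per logged search, one per study subject tag.
searches = ["health", "Health", "immigration", "immigration",
            "immigration", "voting", "opioids"]
holdings = ["health", "health", "immigration", "voting", "voting"]

ratios = search_to_study_ratio(searches, holdings)

# List terms from most to least under-served.
for term, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(term, ratio)
```

Under this reading, terms at the top of the list are candidates for collection development, while a real analysis would also need to handle synonyms, misspellings, and multi-word queries.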