Abstract

This pilot study of 110 scientific papers utilizing environmental sensor data from the Oklahoma Mesonet during its first two decades of operations demonstrates the diversity of potential purposes in scientific research for a robust, rigorously maintained, accessible source of environmental sensor data, as well as the challenges involved in identifying uses of that data within scientific papers. The study authors selected three publication years (1995, 2005, 2015) from an extensive corpus of peer-reviewed journal publications, identified each paper’s specific citation of and uses of the Mesonet’s environmental sensor data, and derived a typology of those usages (assimilation, experimentation, observation, simulation, utilization, validation) found to be most common. The rapid increase in data assimilation research projects today is discussed in terms of the difficulty and importance of correct attribution to individual data sources in these complex research projects. The study examines the possible role played by highly-cited papers that describe the quality assurance procedures in sensor data sources, which may serve as surrogates to signal the quality of the data provided by such sources, and which may also provide a useful contribution towards understanding data citation as a special form of scholarly citation.

Highlights

  • Data, both experimental and observational, have always played a central role in science, as Strasser (2012) points out, but recently the creation and curation of data, datasets, and datastreams have become of intense interest to information scientists as well as to a new generation of data scientists (Halevi & Moed 2012)

  • The findings from the pilot study of these 110 scientific publications, including their citations and uses of the Oklahoma Mesonet sensor data, are described in detail

  • 1995 Publications: As shown in Table 1, there were 11 peer-reviewed journal articles published during the initial year of Oklahoma Mesonet operations


Summary

Introduction

Data, both experimental and observational, have always played a central role in science, as Strasser (2012) points out, but recently the creation and curation of data, datasets, and datastreams have become of intense interest to information scientists as well as to a new generation of data scientists (Halevi & Moed 2012). Acknowledging the use of externally-obtained data was once considered merely a ‘scholar’s courtesy’ (Cronin, 1995); today, such acknowledgment is often elicited by either disciplinary or publication pressures, though it remains far from universal, despite the increase in funding agency and institutional requirements for data attribution and sharing (Kim & Stanton 2016). Even sophisticated approaches to identifying datasets, such as the development of personalized, contextualized dataset search engines (Singhal, Kasturi, & Srivastava 2014; Singhal & Srivastava 2013; Singhal & Srivastava 2017), are hampered by this lack of uniformity. There is wide variation among data, ranging from qualitative to quantitative data types, from potentially replicable experimental

