Abstract

Publishers must be part of the effort to embed FAIR understanding within the research communities they serve. Increasingly, researchers and their institutions see value in preserving the data generated by original studies for wider distribution, enabling future time and cost savings and opening new routes to ensure quality outcomes by reproducing previous research. Data sharing is in growing demand among researchers in many fields of study and across a growing body of data-intensive research methods. Plainly put, the growing call for wider dissemination of research data reflects an open science principle that publishers can no longer ignore. From chemistry to history, some scholars are motivated to share and reuse research data to derive new knowledge from existing assets and to enable greater transparency (where possible) in scholarly inquiry. There are instances showing that data reuse can accelerate new discoveries, save precious time and money, and increase cross-disciplinary advancements. Stakeholders in the scholarly communications lifecycle are therefore under pressure to integrate access to the original data used in the research reported in journal articles and other scholarly publications. Not surprisingly, the global COVID-19 pandemic has acted as a living demonstration of researchers' thirst for access to the datasets that underpin journal articles, books and other published literature (CNI Executive Roundtable, 2021; Shankar et al., 2021). In fact, the speedy distribution of research data has been credited with the rapid development of vaccines and treatments. This global health crisis spotlights the tremendous potential value of research data preservation, curation and distribution, but it also underscores the considerable work required to fully realize that value (Staunton et al., 2021). Simply making data freely available does not make it accurate or actionable.
Achieving the full benefits of data sharing and reuse requires systematic data management, storage, quality assurance and shared protocols implemented across the value chain. Researchers are often reluctant to make their data more widely available, citing privacy or intellectual property concerns, or hitting roadblocks with idiosyncratic formatting (Stuart et al., 2018). This is why some experts have suggested that we apply records governance principles to research data and leverage archival standards (such as ISO 15489) to normalize formats and file types. Putting this data infrastructure to work would also mean investing in operational data literacy among scholars in all fields of study (Wilkinson et al., 2016). Surveys show that data reuse practices vary by disciplinary norms, as well as by logistical factors, such as the availability of data repositories, software compatibility and the effort or skills required to put them to use (Digital Science et al., 2021). Regional policies and accountability vary greatly, from China's national directive to the fragmented practices in the United States (Bryant et al., 2021; Huang et al., 2021). Scholars' inclinations to reuse data from prior studies are also influenced by individual factors, including ‘perceived usefulness, perceived concern and the availability of internal resources’ (Kim & Yoon, 2017a, p. 2709). The challenges are similar for researchers in the social sciences and in the natural sciences, technical and medical fields (Kim & Yoon, 2017b; Melero & Navarro-Molina, 2020). Bringing original research datasets out of the shadows of individual computer files and institutional servers is a challenge for scholars and their universities, funders and publishers to address together. If each member of the scholarly communications value chain adopts the Findable, Accessible, Interoperable, and Reusable (FAIR) data principles, we have a chance of realizing the potential value of openly available research data.
Underscoring the importance of adopting FAIR practices, Eefke Smit and Joris van Rossum reflect in this issue on the successes achieved during STM's Research Data Year. The STM program is entering its third year of supporting publishers in their enactment of data sharing policies, data availability statements and Scholix implementation for linking articles to associated datasets. To make data FAIR, some experts point to the need to contextualize datasets alongside accompanying resources and outputs, such as data management plans, articles and presentations. Using persistent identifiers (PIDs) throughout the research lifecycle is a leading method for connecting openly available data to its associated published materials, as discussed by metadata champions Helena Cousijn, Ted Habermann, Elizabeth Krznarich and Alice Meadows. In other contexts, use of PIDs throughout the research lifecycle has been shown to save researchers administrative time and resources, thereby saving money for their institutions. Saving time and money is a key benefit of data sharing, especially from the funder and institutional perspectives. In their survey of studies that produce unexpected or unsuccessful results, Marie-Emilia Herbet and her team demonstrate the knowledge lost when these ‘negative’ outcomes are unreported or the data are suppressed. Aki MacFarlane of the Wellcome Trust addresses publishers directly in his column, advocating for increased efforts to support new research practices that encourage the reusability and distribution, and thereby the value, of data within key research communities. The importance of accessible data repositories that are fit for purpose is a theme throughout this issue, underscored by case studies such as the Science Data Bank in China, as presented by Chengzan Li and the team.
Reflecting similar challenges in the United Kingdom, Catriona Manville and Grace Melvin champion open data publication and protocols in medical studies, such as clinical trials. Without improved data management and distribution, they argue, a great deal of time and resources is wasted in health care research. What we see in this issue is that data curation and distribution are hard enough; the reuse of that data is perhaps the bigger challenge. Qingyu Duan, Xiaoguang Wang and Ningyuan Song make the case for practical approaches to making data management and new, reuse-oriented practices more researcher-friendly. Their article outlines 21 specific measures that can be leveraged by publishers, funders and research institutions. Clear expectations and journal data policies are among the measures encouraged by our authors, such as Yu Wang et al., who provide an overview of research data policies in Chinese journals. The impact of FAIR data policies is demonstrated by a case study of earth and environmental studies journals, as reported by Matthew Cannon and the Taylor & Francis team. Through the diverse perspectives of this issue's contributors, several opportunities are illuminated for publishers to support open data sharing. For example, to fulfil the promises of data sharing and effective reuse, publishers must be part of efforts to embed an understanding of FAIR principles more successfully within the research communities they serve. Publishers may find new business opportunities in addressing the current limitations in data literacy, geographical disparity and software interoperability. Research data management practices are far from standardized and the implications for research participants have not been uniformly addressed. “Across the board, we are seeing an increased responsibility for data stewardship by publishers, institutions, funders and research data repository providers”.
In this issue, we aimed to grapple with the dominant issues that affect publishers and our partners; however, we cannot claim that this issue is entirely comprehensive. For example, contributions to this themed issue did not address uneven access to research data, inequities in data literacy or the risks of ingraining datasets with cultural injustices or biased beliefs. Given prevailing national and institutional policies, articles in this issue focus heavily on Chinese, European and American research teams, policies and repositories, so there are gaps in our global view of this topic. Several articles in this issue suggest there is much more to be done regarding the upstream implications of widespread data collection and sharing, and the practical requirements for standardized formatting or archival practices. Therefore, we leave these as suggestions for our ongoing work together in the interest of accelerating scholarly knowledge and discovery.

Lettie Y. Conrad, Richard Delahunty, Wendy Ding
