Biodiversity data that meet the principles of Findability, Accessibility, Interoperability, and Reusability (FAIR) have tremendous potential to facilitate integrative science addressing large-scale biodiversity challenges. To increase the amount, breadth, and quality of FAIR biodiversity data, we must look to the individuals responsible for creating, sharing, and publishing their data. Open conversation and collaboration among those generating data and those using, extending, and curating data are critical (Leonelli 2016). Graduate students and other early career scientists often work between these points along the data pathway. As they enter the field of biodiversity science, they are required to have not only knowledge of taxa, ecosystems, and research practices but also a new set of skills to handle data of many types (Monfils et al. 2021). As best practices in biodiversity data science are actively defined and the need for FAIR data and Open Science continues to increase, involvement by early career scientists is especially valuable, as they share their cutting-edge skills to integrate and publish biodiversity data. Early career scientists are often more likely and willing to publish the data they collect (Campbell et al. 2019) and can directly benefit through peer-reviewed publications and professional connections. However, safeguards and incentives are necessary, as data publication can carry a heavy burden and pose threats to emerging careers. Early career scientists may spend large amounts of time and effort to publish data without recognition, expose potential faults in their methodologies that would not otherwise be apparent, lose ownership of their data and derivatives, or have their data used unethically or inappropriately (Franz and Sterner 2018). Despite a demonstrated positive attitude among early career scientists toward Open Science and Open Data practices and values (Toribio-Flórez et al. 2021), the responsibility and risks of data sharing cannot be overlooked.
In working to improve and expand the FAIR data landscape and support future biodiversity data science careers, there are several potential challenges to consider. First, there are often barriers to acquiring the knowledge and skills needed to utilize, aggregate, and publish data. Teaching of biodiversity data science has not been widely adopted in undergraduate curricula (Ellwood et al. 2019, Emery et al. 2021), so scientists must seek out educational materials on their own during their graduate or postgraduate careers. Biodiversity data science must be integrated into all stages of science careers, for example by introducing incoming college students to the full range of roles in biodiversity science that include data; incorporating data skills early in undergraduate programs and emphasizing the importance and value of biodiversity data alongside established life science concepts; increasing opportunities to gain data expertise in graduate programs regardless of where students' career interests lie along the data continuum; and supporting established professionals who recognize the need for these skills and efforts without formal preparation and institutional resources of their own (Barone et al. 2017). Training materials created for biodiversity data science professionals, including those offered by the Carpentries, Global Biodiversity Information Facility (GBIF), Biodiversity Information Standards (TDWG), and others, can be translated to all these levels. Second, sharing FAIR data is not worth the effort or risk for many individuals. If graduate programs and future employers do not place a high value on biodiversity data science and do not train, fund, and promote early career scientists in this area, there is little incentive to pursue this expertise at all.
Because early career scientists are often encouraged to publish data without clearly defined pathways or widely available resources, they are vulnerable to mistakes and criticism that could harm their careers. Finally, conversations and initiatives in biodiversity data science may not be accessible to early career scientists; appropriate on-ramps are needed to allow both novices and experts in this field to participate meaningfully. By taking advantage of these connections and opportunities, early career scientists can promote their work and advocate for their needs, and established professionals can gain the perspective necessary to successfully guide the early career workforce toward specific goals related to FAIR data curation and publication. Here we present our experience in each of these areas and share ways in which we can continue to support and advance early career scientists in biodiversity data science.