BiCIKL (Biodiversity Community Integrated Knowledge Library) is a European Union (EU) Horizon 2020 project (2021–2024) building a new community of research infrastructures (RIs), researchers and other stakeholders through improved access to interlinked, open and FAIR (Findable, Accessible, Interoperable, Reusable) biodiversity data along the biodiversity research cycle (specimens, sequences, taxon names, publications) (Penev et al. 2022). The project’s 14 partners developed or substantially improved 16 tools and services, which are currently being onboarded to the European Open Science Cloud (EOSC) and are presented in the FAIR Data Place (FDP) of BiCIKL’s flagship product, the Biodiversity Knowledge Hub (BKH). The tools and data were used in Open Call projects carried out by research groups worldwide. A key achievement of BiCIKL is the establishment of several new bi-directional links between the participating RIs through shared and interoperable data standards and web services. The sustainability of the BiCIKL services, and especially of the strong collaborative spirit developed through the project, will be ensured by a membership agreement for the maintenance and further development of the BKH. The results of BiCIKL are diverse and address various aspects of implementing open science practices in biodiversity research. The outputs of the project partners and external collaborators from the Open Call projects include more than 80 papers and conference abstracts (see the article collections in Penev et al. 2022a and Thessen et al. 2023), two policy briefs (Penev et al. 2024, Agosti et al. 2024), three Biodiversity Information Standards (TDWG) symposia (2021, 2023, 2024), several videos, factsheets and other training materials, guidelines and best practice recommendations. A special focus of BiCIKL was the extraction and liberation of data from the PDFs of several thousand published biodiversity articles, making these data accessible and reusable.
The new BiCIKL community proved successful both in technological innovation and in fostering a long-lasting spirit of collaboration between biodiversity and genomics researchers, data repositories, RIs, publishers and other stakeholders. Beyond BiCIKL, we envisage working towards further integration and interoperability between data domains by embracing human-in-the-loop collaborations enhanced by Artificial Intelligence (AI). The implementation of AI and Large Language Models (LLMs) should be possible under one important condition: to understand the complexity of past, recent and future changes in biodiversity and natural environments, AI tools should be based on adequately curated, semantically structured and interlinked biodiversity data. We see this radical new step as a concerted community effort towards building a “Biodiversity Supergraph” (Fig. 1), understood here as a two-component ecosystem consisting of (1) a centrally orchestrated system of tools and services, and (2) distributed sources of transformed, semantically enhanced FAIR Linked Open Data, supplied by the partnering RIs. The “Biodiversity Supergraph” will integrate biodiversity data on a scale and at an operational level that has never been attempted before. Over the next decade, it will be key to establishing a baseline of global, biodiversity-related information serving organisations, academia, industry and society.