Abstract

Background: Recent years have brought great progress in efforts to digitize the world’s biodiversity data, but integrating data from many different providers, and across research domains, remains challenging. Semantic Web technologies have been widely recognized by biodiversity scientists for their potential to help solve this problem, yet these technologies have so far seen little use for biodiversity data. Such slow uptake has been due, in part, to the relative complexity of Semantic Web technologies along with a lack of domain-specific software tools to help non-experts publish their data to the Semantic Web.

Results: The BiSciCol Triplifier is new software that greatly simplifies the process of converting biodiversity data in standard, tabular formats, such as Darwin Core-Archives, into Semantic Web-ready Resource Description Framework (RDF) representations. The Triplifier uses a vocabulary based on the popular Darwin Core standard, includes both Web-based and command-line interfaces, and is fully open-source software.

Conclusions: Unlike most other RDF conversion tools, the Triplifier does not require detailed familiarity with core Semantic Web technologies, and it is tailored to a widely popular biodiversity data format and vocabulary standard. As a result, the Triplifier can often fully automate the conversion of biodiversity data to RDF, thereby making the Semantic Web much more accessible to biodiversity scientists who might otherwise have relatively little knowledge of Semantic Web technologies. Easy availability of biodiversity data as RDF will allow researchers to combine data from disparate sources and analyze them with powerful linked data querying tools. However, before software like the Triplifier, and Semantic Web technologies in general, can reach their full potential for biodiversity science, the biodiversity informatics community must address several critical challenges, such as the widespread failure to use robust, globally unique identifiers for biodiversity data.

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-15-257) contains supplementary material, which is available to authorized users.
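
To make the idea of converting tabular biodiversity data to RDF more concrete, the following is a minimal sketch and not the Triplifier's actual implementation. It assumes a CSV table whose column names are Darwin Core terms, treats every row as a single dwc:Occurrence, and writes N-Triples. The `http://example.org/occurrence/` prefix and the function names `rows_to_ntriples` and `escape_literal` are hypothetical, chosen only for illustration. The Triplifier itself uses its own Darwin Core-based vocabulary with several classes and linking properties, which this sketch does not attempt to reproduce.

```python
import csv
import io

DWC = "http://rs.tdwg.org/dwc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical identifier prefix; real data needs robust, globally unique IDs.
BASE = "http://example.org/occurrence/"


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(table_text, id_column="occurrenceID"):
    """Convert CSV text with Darwin Core column names to N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(table_text)):
        subject = f"<{BASE}{row[id_column]}>"
        # Assumption: every row describes one dwc:Occurrence.
        triples.append(f"{subject} <{RDF_TYPE}> <{DWC}Occurrence> .")
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the identifier column and empty cells
            triples.append(
                f'{subject} <{DWC}{column}> "{escape_literal(value)}" .'
            )
    return triples


SAMPLE = """occurrenceID,scientificName,country
1001,Puma concolor,Costa Rica
1002,Quercus alba,
"""

if __name__ == "__main__":
    for line in rows_to_ntriples(SAMPLE):
        print(line)
```

Each table row becomes one RDF subject, and each non-empty cell becomes one triple whose predicate is the corresponding Darwin Core term. The placeholder identifier prefix is exactly the kind of weak identifier the abstract warns about; records from different providers can only be linked reliably when their identifiers are stable and globally unique.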

Highlights

  • Recent years have brought great progress in efforts to digitize the world’s biodiversity data, but integrating data from many different providers, and across research domains, remains challenging

  • A seventh category was added, dwc:MaterialSample [37], but it is not yet included in the Triplifier’s ontology. For each of these six classes, we considered their meanings as defined in the Darwin Core (DwC) standard as well as how they are most commonly used in practice in order to decide how instances of these classes could best be connected using the four simple properties discussed above

  • The BiSciCol Triplifier is available to users as both a Web application [42] and a command-line program

Summary

Introduction

Recent years have brought great progress in efforts to digitize the world’s biodiversity data, but integrating data from many different providers, and across research domains, remains challenging. Semantic Web technologies have been widely recognized by biodiversity scientists for their potential to help solve this problem, yet these technologies have so far seen little use for biodiversity data. Such slow uptake has been due, in part, to the relative complexity of Semantic Web technologies along with a lack of domain-specific software tools to help non-experts publish their data to the Semantic Web. Biocollections represent irreplaceable legacy information about our biosphere that is essential for understanding how biodiversity is changing in an era of unprecedented human impacts [1,2,3]. Such analyses are only practical if data from biocollections around the world are digitized, integrated, and made widely available online. As each institution populates its own data island, the links between these objects are lost, and putting these pieces back together again is, at best, very challenging and, at worst, practically impossible.

