Abstract

Herbarium sheets present a unique view of the world's botanical history, evolution, and biodiversity, which makes them an all-important data source for botanical research. With the increased digitization of herbaria worldwide and advances in fine-grained visual classification, which can facilitate automatic identification of herbarium specimen images, there are many opportunities for supporting and expanding research in this field. However, existing datasets are either too small or not diverse enough in terms of represented taxa, geographic distribution, and imaging protocols. Furthermore, aggregating datasets is difficult because taxa are recognized under a multitude of names and must be aligned to a common reference. We introduce the Herbarium 2021 Half-Earth dataset, the largest and most diverse dataset of herbarium specimen images to date for automatic taxon recognition. We also present the results of the Herbarium 2021 Half-Earth challenge, a competition that was part of the Eighth Workshop on Fine-Grained Visual Categorization (FGVC8) and hosted by Kaggle to encourage the development of models that automatically identify taxa from herbarium sheet images.

Highlights

  • Herbaria, like other natural history collections, are immense primary data repositories documenting biodiversity across space and time over the last 500 years (Stefanaki et al., 2019).

  • In this paper we introduce the Herbarium 2021 Half-Earth dataset, which aims to address the limitations of existing datasets and is, to date, the largest and most diverse dataset of herbarium specimen images for automatic taxon recognition.

  • We present the results of the challenge of the same name, the Herbarium 2021 Half-Earth challenge, a competition organized as part of the Eighth Workshop on Fine-Grained Visual Categorization (FGVC8) at the Computer Vision and Pattern Recognition conference (CVPR) in 2021.

Summary

Introduction

Herbaria, like other natural history collections, are immense primary data repositories documenting biodiversity across space and time over the last 500 years (Stefanaki et al., 2019). Plants are essential to life on Earth, yet an estimated 37–44% of all vascular plant species are threatened with extinction (Nic Lughadha et al., 2020), underscoring the urgency of identifying and classifying the estimated 70,000 flowering plant species not yet described (Bebber et al., 2010; Joppa et al., 2011). Half of these new species are predicted to be already preserved in herbaria, where they wait an average of 35 years from the date of first specimen collection to be detected and described (Bebber et al., 2010). Contributing to this delay is the dwindling number of taxonomists with the broad plant identification skills needed to recognize new species, who face ever-increasing demands on their time and expertise (Secretariat of the Convention on Biological Diversity, 2007).
