Abstract

Background: Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics. Since efficient plant species identification is impeded mainly by the large number of possible candidate species, providing a shortlist of likely candidates can significantly expedite the task. Whereas species distribution models rely heavily on geo-referenced occurrence data, such information remains largely unused in plant taxa identification tools.

Results: In this paper, we study the feasibility of computing a ranked shortlist of plant taxa likely to be encountered by an observer in the field. We use the territory of Germany as a case study, with a total of 7.62M records of freely available plant presence-absence data and occurrence records for 2.7k plant taxa. We systematically study the achievable recommendation quality based on two types of source data: binary presence-absence data and individual occurrence records. Furthermore, we study strategies for aggregating records into a taxa recommendation based on the location and date of an observation.

Conclusion: We evaluate recommendations using 28k geo-referenced and taxa-labeled plant images hosted on the Flickr website as an independent test dataset. Relying on location information from presence-absence data alone results in an average recall of 82%. However, we find that occurrence records are complementary to presence-absence data, and using both in combination yields a considerably higher recall of 96% along with improved ranking metrics. Ultimately, by reducing the list of candidate taxa by an average of 62%, a spatio-temporal prior can substantially expedite the overall identification problem.

Highlights

  • Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics

  • Previous studies on automated species identification have shown the benefit of using location information for improving identification results. However, they did not investigate the accuracy of ranked taxa recommendations retrieved directly from occurrence data. Such observation records are becoming increasingly available via online services, which provide comprehensive sets of presence-absence as well as presence-only occurrence records. We therefore argue that a systematic study is required to evaluate how spatio-temporal context information can be exploited to inform on-site plant species identification

  • Metrics reported throughout this section include average recall (R), average list length (LL), average list reduction (LR), mean reciprocal rank (MRR) and median rank (M) as defined in the previous section
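The metric definitions themselves are not reproduced on this page, so the following Python sketch uses common conventions that may differ from the paper's: recall is the share of test observations whose true taxon appears in the recommended list, list reduction is one minus the average list length divided by the total number of taxa, a missing taxon contributes a reciprocal rank of 0, and the median rank is taken over hits only. The function name `evaluate` and its parameters are hypothetical.

```python
from statistics import mean, median


def evaluate(recommendations, truths, n_taxa):
    """Compute R, LL, LR, MRR and M under the assumptions stated above.

    recommendations: one ranked list of taxa per test observation
    truths: the true taxon of each test observation
    n_taxa: total number of candidate taxa (assumed baseline for LR)
    """
    hits, lengths, reciprocal_ranks, ranks = [], [], [], []
    for ranked, truth in zip(recommendations, truths):
        lengths.append(len(ranked))
        if truth in ranked:
            rank = ranked.index(truth) + 1  # ranks are 1-based
            hits.append(1)
            reciprocal_ranks.append(1.0 / rank)
            ranks.append(rank)
        else:
            hits.append(0)
            reciprocal_ranks.append(0.0)  # assumption: a miss counts as 0
    avg_length = mean(lengths)
    return {
        "R": mean(hits),
        "LL": avg_length,
        "LR": 1.0 - avg_length / n_taxa,
        "MRR": mean(reciprocal_ranks),
        "M": median(ranks) if ranks else None,  # median over hits only
    }


# Toy input, invented for illustration only.
example = evaluate(
    recommendations=[["a", "b", "c"], ["b", "a"], ["c"]],
    truths=["b", "b", "a"],
    n_taxa=10,
)
```

Restricting the median rank to hits keeps it defined when some observations are missed; treating a miss as rank infinity would be an equally defensible choice.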

Summary

Introduction

Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics. Accurate plant species identification represents the basis for all aspects of plant-related research and is an important component of workflows in plant ecological research [1]. Numerous activities, such as studying the biodiversity richness of a region, monitoring populations of endangered species, determining the impact of climate change on species distribution, and weed control actions, depend on accurate identification skills. The German Biodiversity Exploratories project [10] studied sites spanning an area of 422 km² to 1300 km² and found that on grassland sites 318 to 365 vascular plant species occurred [11], while on forest sites merely 277 to 376 species were present [12]. These figures represent less than 10% of the entire German flora. Taking a user's current position in the field to estimate which species could possibly be encountered nearby can simplify identification tasks and is highly suitable given today's prevalence of mobile devices with self-localization technology.
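As a rough illustration of using an observer's position to shortlist candidate taxa, the Python sketch below counts occurrence records within a fixed radius of the query location and ranks taxa by that count. This is not the aggregation strategy evaluated in the paper, which is not described in this excerpt. The record layout `(taxon, lat, lon)`, the 10 km default radius, the function names and the sample coordinates are all assumptions made for the example.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def shortlist(records, lat, lon, radius_km=10.0):
    """Rank taxa by the number of occurrence records within radius_km.

    records: iterable of (taxon, lat, lon) tuples (hypothetical layout)
    """
    counts = Counter(
        taxon
        for taxon, rec_lat, rec_lon in records
        if haversine_km(lat, lon, rec_lat, rec_lon) <= radius_km
    )
    # most_common() sorts by count, highest first
    return [taxon for taxon, _ in counts.most_common()]


# Invented sample records: three near Berlin, one in the Alps.
records = [
    ("Bellis perennis", 52.52, 13.40),
    ("Bellis perennis", 52.53, 13.41),
    ("Urtica dioica", 52.51, 13.39),
    ("Gentiana clusii", 47.42, 10.98),
]
nearby = shortlist(records, lat=52.52, lon=13.405)
```

A date filter, for example keeping only records from the same weeks of the year, could be added to the same comprehension to approximate the temporal part of such a prior.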

