Abstract

Recent progress in using deep learning techniques to automate the analysis of complex image data is opening up exciting new avenues for research in biodiversity science. However, potential applications of machine learning methods in biodiversity research are often limited by the relative scarcity of data suitable for training machine learning models. Developing high-quality training data sets can be surprisingly challenging and can easily consume hundreds of person-hours. In this talk, we present the results of our recent work implementing and comparing several methods for generating annotated, biodiversity-oriented image data for training machine learning models: collaborative expert scoring, local volunteer image annotators with on-site training, and distributed, remote image annotation via citizen science platforms. We discuss error rates, among-annotator variance, and the depth of coverage required to ensure highly reliable image annotations. We also discuss the time requirements and efficiency of each method. Finally, we present new software, called ImageAnt (currently under development), that supports efficient, highly flexible image annotation workflows. ImageAnt was created primarily in response to the challenges we encountered in our own efforts to generate image-based training data for machine learning models. It features a simple user interface and can be used to implement sophisticated, adaptive scripting of image annotation tasks.
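To make the notions of among-annotator agreement and depth of coverage concrete, the following is a minimal sketch and not the method used in the talk. It assumes binary labels, annotators who err independently with the same accuracy, and consensus by simple majority vote. The function names and the example labels are hypothetical.

```python
from collections import Counter
from math import comb


def consensus(labels):
    """Return (majority label, fraction of annotators who chose it).

    Ties are resolved in favour of the label seen first.
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


def majority_vote_accuracy(p_correct, n_annotators):
    """Probability that a strict majority of annotators is correct.

    Assumption: binary label, each annotator independently correct
    with probability p_correct (a simplification of real annotators).
    """
    needed = n_annotators // 2 + 1
    return sum(
        comb(n_annotators, k)
        * p_correct**k
        * (1 - p_correct) ** (n_annotators - k)
        for k in range(needed, n_annotators + 1)
    )


def coverage_needed(p_correct, target, max_annotators=51):
    """Smallest odd number of annotators per image whose majority vote
    reaches the target accuracy, or None if none up to max_annotators does."""
    for n in range(1, max_annotators + 1, 2):
        if majority_vote_accuracy(p_correct, n) >= target:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical annotations of one image by three annotators.
    print(consensus(["flower", "flower", "bud"]))
    # Annotators per image needed for 99% reliability at 90% individual accuracy.
    print(coverage_needed(0.9, 0.99))
```

Under these assumptions, annotators who are individually correct 90% of the time reach about 97% consensus accuracy with three annotations per image and exceed 99% with five. Real annotators are neither independent nor equally skilled, so measured error rates should take precedence over this estimate.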


New Software Tools, and Best Practices for Developing High-quality Training Data for Machine Learning-based Image Analysis in Biodiversity Research
