Abstract

In wildlife monitoring, camera traps generate large amounts of video data. Training deep learning methods requires annotated video data, i.e., video in which each frame is labeled with the correct number and species of the observed animals. Manual annotation of video clips, however, is extremely time-consuming and laborious. In this proof of concept, we compare three state-of-the-art approaches to annotating video data: manual annotation using the VGG Image Annotator, interactive annotation using the MiVOS video annotator, and automated annotation using an adapted and customized Tracktor approach that propagates annotations from frame to frame through complete video clips. Experiments on wildlife video clips captured by camera traps show that annotation time drops from hours to minutes (i.e., by an order of magnitude). Moreover, the detection scores of animals in each frame are not merely maintained but improve from 54.7% to 58.5% compared with training on perfect but costly manual annotations.
