Abstract
Reliable detection and tracking of animals in wildlife videos is an essential basis for analyzing animal behavior and recognizing individual animals. Bounding boxes are not sufficient to correctly distinguish individual animals that stand close to each other. Instead, an exact contour of each animal is needed: an instance mask, which is the result of instance segmentation. In this paper, we present SWIFT, a novel multi-object tracking and segmentation (MOTS) pipeline that solves this task. We evaluate our approach on a self-created wildlife video dataset containing red deer and fallow deer. Our dataset is one of the very few datasets in wildlife monitoring that are annotated with instance masks and tracking IDs. Compared to a state-of-the-art instance segmentation approach, SWIFT significantly improves the quality of the instance masks, from 0.432 to 0.495 average precision. Our tracking algorithm uses multiple filtering steps to either delete incorrectly detected tracks or merge tracks that belong together but are not yet connected. Compared to a state-of-the-art tracking approach, this increases the multi-object tracking accuracy score from 57.2% to 63.8%, which means that our tracking results contain fewer errors.
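To make the link between the multi-object tracking accuracy score and tracking errors concrete, the following is a minimal sketch of the standard CLEAR MOT definition of multi-object tracking accuracy (MOTA): one minus the sum of false negatives, false positives, and identity switches, divided by the number of ground-truth objects. This is an assumption about the metric's form; the paper may use a mask-based variant for MOTS. The function name `mota` and all counts below are hypothetical and are not taken from the paper's results.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Standard CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT.

    All counts are summed over every frame of the evaluated videos.
    The score can become negative when errors outnumber ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts for illustration only (not from the paper).
score = mota(false_negatives=30, false_positives=15, id_switches=5, num_gt=200)
print(f"MOTA = {score:.3f}")  # prints "MOTA = 0.750"
```

Under this definition, deleting incorrectly found tracks lowers the false-positive count and merging fragmented tracks lowers the identity-switch count, so both filtering steps raise the score.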