Abstract

Automatic individual recognition of wild Crested Ibis faces two challenges: a lack of labeled data and a number of individuals that is unknown in advance. In this paper, we propose a hybrid method that combines self-supervised learning with clustering to recognize individual wild Crested Ibis from their vocalizations. To address the first challenge, we extend the Bootstrap Your Own Latent for Audio (BYOL-A) model with an improved augmentation module and a Spatial Group-wise Enhance (SGE) attention module, yielding the self-supervised model BYOL-AIS, which is designed to extract more discriminative representations of Crested Ibis vocalizations. To address the second challenge, we introduce a clustering method that combines Uniform Manifold Approximation and Projection (UMAP) with Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) to separate these representations by individual. Evaluated on vocalizations collected from 10 Crested Ibis individuals, the proposed method achieves a recognition accuracy of 0.864, which is comparable to the performance of commonly used supervised methods. These results suggest that the method is feasible for recognizing wild birds in the absence of labeled data and has the potential to serve as an analytical tool for processing large volumes of monitoring data.
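
The abstract does not state how recognition accuracy is computed from cluster assignments. One common convention, assumed here purely for illustration, is to map each cluster to its majority individual and count noise points as errors. The sketch below uses only the Python standard library; the function name `majority_vote_accuracy` and the toy labels are hypothetical, and the `-1` noise label follows the convention of HDBSCAN implementations.

```python
from collections import Counter, defaultdict

NOISE = -1  # label HDBSCAN implementations give to unclustered points


def majority_vote_accuracy(cluster_ids, true_ids):
    """Fraction of calls whose cluster's majority individual is their own.

    Assumption: noise points count as misrecognized. This scoring rule is
    an illustrative choice, not taken from the paper.
    """
    if len(cluster_ids) != len(true_ids):
        raise ValueError("cluster_ids and true_ids must have the same length")
    if not true_ids:
        raise ValueError("need at least one labeled call")

    # Group the true identities of the calls that fell into each cluster.
    members = defaultdict(list)
    for cluster, individual in zip(cluster_ids, true_ids):
        if cluster != NOISE:
            members[cluster].append(individual)

    # Each cluster contributes the count of its most frequent individual.
    correct = sum(
        Counter(individuals).most_common(1)[0][1]
        for individuals in members.values()
    )
    return correct / len(true_ids)


# Hypothetical toy data: 10 calls, 3 clusters, 2 noise points.
clusters = [0, 0, 0, 1, 1, 1, 2, 2, -1, -1]
birds = ["A", "A", "B", "B", "B", "B", "C", "C", "A", "C"]
print(majority_vote_accuracy(clusters, birds))
```

Because this rule never requires the number of clusters to equal the number of individuals, it remains usable when the clustering step discovers more or fewer groups than there are birds.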
