Climate change poses severe risks to the survival of many whale populations, whose habitats and migration patterns are affected by environmental change. Detecting these whales effectively, especially in deep-sea environments, requires AI-based techniques that can handle the acoustic diversity and variability of different species. However, current whale-detection methods rely on pre- and post-processing steps that reduce their efficiency and generalizability. To address this issue, we present DeepWhaleNet, a novel deep-learning model that automates whale detection in Underwater Passive Acoustic Monitoring datasets. DeepWhaleNet simplifies the detection process by extracting relevant features directly from raw log-power spectrograms and supports conservation efforts to protect these threatened species. Our model takes a larger short-time Fourier transform as input and uses a custom ResNet-18 architecture for classification, enabling it to separate whale sounds from noise and capture their temporal and spectral characteristics. We evaluate DeepWhaleNet and show that it surpasses state-of-the-art methods, achieving an 8.3% improvement in F1 score and 21% higher average precision for binary relevance than the baseline method. Moreover, an ablation study on multi-label retrieval and a 99.1% recall for Blue Whales demonstrate the model's versatility and suitability for species-specific retrieval problems.
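For illustration only, a minimal sketch of the kind of pipeline the abstract describes: a log-power spectrogram front end feeding a single-channel ResNet-18 classifier. This is not the paper's implementation; the class name, the torchaudio/torchvision components, and all parameter values (n_fft, hop length, sample rate, number of classes) are assumptions.

```python
# Minimal sketch (not the paper's code): raw audio -> log-power spectrogram
# -> ResNet-18 -> per-class logits, mirroring the pipeline in the abstract.
import torch
import torch.nn as nn
import torchaudio
from torchvision.models import resnet18


class WhaleDetector(nn.Module):
    """Hypothetical DeepWhaleNet-style detector (illustrative only)."""

    def __init__(self, n_classes: int = 2, n_fft: int = 2048, hop_length: int = 256):
        super().__init__()
        # A larger STFT window (n_fft) gives finer frequency resolution for
        # low-frequency whale calls; the values here are assumed, not the paper's.
        self.spec = torchaudio.transforms.Spectrogram(
            n_fft=n_fft, hop_length=hop_length, power=2.0
        )
        self.to_db = torchaudio.transforms.AmplitudeToDB(stype="power")
        # Standard ResNet-18 adapted to accept a 1-channel spectrogram input.
        self.backbone = resnet18(weights=None, num_classes=n_classes)
        self.backbone.conv1 = nn.Conv2d(
            1, 64, kernel_size=7, stride=2, padding=3, bias=False
        )

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples) of raw hydrophone audio
        x = self.to_db(self.spec(waveform))  # (batch, freq, time) log-power spectrogram
        x = x.unsqueeze(1)                   # add channel dim -> (batch, 1, freq, time)
        return self.backbone(x)              # per-class logits


if __name__ == "__main__":
    model = WhaleDetector(n_classes=2)
    dummy = torch.randn(4, 16000 * 5)  # 4 clips of 5 s at an assumed 16 kHz rate
    print(model(dummy).shape)          # torch.Size([4, 2])
```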