Passive acoustic monitoring is increasingly used to study marine mammals, leading to the accumulation of large acoustic datasets. Analyzing these datasets is impractical without automated detection and classification software. Detectors and classifiers based on deep neural networks have shown great potential, but their performance is often limited by the scarcity of annotated training samples, and their application is restricted to the specific acoustic environment(s) from which their training data were collected. We address these limitations by employing transfer learning, a deep learning concept whereby knowledge acquired in a source domain is transferred to a target domain. Specifically, we considered two different underwater acoustic environments as the source and target domains. The objective was to take a deep neural network trained in one environment, where annotated training samples were abundant, and optimize its performance in the other environment, where annotated samples were limited. Training and testing were conducted using three acoustic datasets containing North Atlantic right whale (Eubalaena glacialis) upcalls. Experiments show that adapting a trained model to the new environment substantially improved recall, from 70% to 85%, while maintaining a low false-positive rate of fewer than 5 per hour. The methodology is implemented as an open-source Python tool to facilitate the creation of more tailored deep learning-based acoustic detectors and classifiers for North Atlantic right whale vocalizations and other stereotyped marine mammal calls.
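To illustrate the kind of adaptation described above, the following is a minimal sketch of one common transfer-learning recipe in PyTorch: the convolutional feature extractor of a classifier pretrained on the source domain is frozen, and only the classification head is fine-tuned on a small set of target-domain samples. The architecture, input shapes, and names (SourceCNN, source_domain_weights.pt, target_loader) are hypothetical stand-ins, not the authors' actual model or tool.

```python
# Sketch of fine-tuning a source-domain CNN on limited target-domain data.
# Assumes binary upcall/no-upcall spectrogram inputs of shape (1, 128, 128);
# all names and the architecture are illustrative, not the paper's model.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class SourceCNN(nn.Module):
    """Small CNN spectrogram classifier (stand-in for the pretrained model)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 32 * 32, 2)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SourceCNN()
# model.load_state_dict(torch.load("source_domain_weights.pt"))  # hypothetical checkpoint

# Freeze the feature extractor; adapt only the classification head, a common
# choice when target-domain annotations are scarce.
for p in model.features.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Placeholder target-domain data: a small batch of annotated spectrogram clips.
x = torch.randn(64, 1, 128, 128)
y = torch.randint(0, 2, (64,))
target_loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

model.train()
for epoch in range(5):
    for xb, yb in target_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```

In practice the trade-off is how much of the network to re-train: freezing more layers guards against overfitting to the few target-domain samples, while unfreezing more layers (often with a lower learning rate) allows greater adaptation to the new acoustic environment.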