Abstract

Passive acoustic monitoring is a well-established tool for researching the occurrence, movements, and ecology of a wide variety of marine mammal species. Advances in hardware and data collection have exponentially increased the volumes of passive acoustic data collected, such that discoveries are now limited by the time required to analyze rather than collect the data. To address this limitation, we trained a deep convolutional neural network (CNN) to identify humpback whale song in over 187,000 h of acoustic data collected at 13 different monitoring sites in the North Pacific over a 14-year period. The model successfully detected 75 s audio segments containing humpback song with an average precision of 0.97 and average area under the receiver operating characteristic curve (AUC-ROC) of 0.992. The model output was used to analyze spatial and temporal patterns of humpback song, corroborating known seasonal patterns in the Hawaiian and Mariana Islands, including occurrence at remote monitoring sites beyond well-studied aggregations, as well as the novel discovery of humpback whale song at Kingman Reef, at 5° North latitude. This study demonstrates the ability of a CNN trained on a small dataset to generalize well to a highly variable signal type across a diverse range of recording and noise conditions. We demonstrate the utility of active learning approaches for creating high-quality models in specialized domains where annotations are rare. These results validate the feasibility of applying deep learning models to identify highly variable signals across broad spatial and temporal scales, enabling new discoveries by combining large datasets with cutting-edge tools.
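
For readers unfamiliar with the reported metrics, the short sketch below shows how average precision and AUC-ROC are commonly computed from per-segment labels and classifier scores using scikit-learn; the label and score arrays are illustrative placeholders, not data from this study.

    # Minimal sketch: average precision and AUC-ROC for a binary
    # per-segment classifier, via scikit-learn. Labels and scores
    # below are illustrative placeholders, not the study's data.
    import numpy as np
    from sklearn.metrics import average_precision_score, roc_auc_score

    labels = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # 1 = humpback song present
    scores = np.array([0.92, 0.20, 0.81, 0.74, 0.40, 0.08, 0.95, 0.33])  # per-segment model output

    print("average precision:", average_precision_score(labels, scores))
    print("AUC-ROC:", roc_auc_score(labels, scores))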

Highlights

  • In the marine environment, where limited light transmission restricts visual cues, cetaceans utilize sound for every aspect of their day-to-day lives, with all species of whales and dolphins making some sort of vocalization

  • We present the development of a deep convolutional neural network (CNN) model, trained with an active learning process that increased the size and focus of the training data, to classify humpback whale vocalizations in an acoustic dataset of unprecedented temporal and spatial scale

  • We found that per-channel energy normalization (PCEN) applied to the spectrograms outperformed both log and root compression, providing an increase in both average precision and area under the receiver operating characteristic curve (AUC-ROC); see the sketch after this list
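
As a minimal, hypothetical illustration of such a front-end comparison, the sketch below contrasts log compression with PCEN on a mel spectrogram using librosa; the example clip and parameter values are illustrative, not those used in the study.

    # Hypothetical front-end comparison: log compression vs. per-channel
    # energy normalization (PCEN) on a mel spectrogram, using librosa.
    # The example clip and parameters are illustrative, not the study's.
    import librosa
    import numpy as np

    y, sr = librosa.load(librosa.ex("trumpet"))  # bundled example clip

    # Mel magnitude spectrogram (power=1), since PCEN expects linear magnitudes.
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64, power=1.0)

    # Baseline front end: log (dB) compression.
    log_S = librosa.power_to_db(S**2, ref=np.max)

    # PCEN: smooths each frequency channel's recent energy and divides it out,
    # suppressing stationary background noise while preserving transient calls.
    pcen_S = librosa.pcen(S * (2**31), sr=sr, gain=0.98, bias=2.0, power=0.5)

PCEN's adaptive, per-channel gain makes it comparatively robust to the wide range of ambient noise conditions found across long-term deployments, which is consistent with the improvement reported in the highlight above.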

Introduction

In the marine environment, where limited light transmission restricts visual cues, cetaceans rely on sound for every aspect of their day-to-day lives, with all species of whales and dolphins producing some form of vocalization. Many cetacean vocalizations are identifiable to the species or even population level, enabling the use of passive acoustic recorders to examine species occurrence and seasonality (e.g., Clark et al., 2002; Širović et al., 2003; Munger et al., 2008), as well as population movements. New discoveries are often limited by the time it takes to analyze the data rather than by the data collection itself. To address this challenge, scientists have worked to accelerate acoustic data analysis by automating cetacean call identification (Bittle and Duncan, 2013). Many species have highly variable call types that require significant manual input to classify correctly (Bittle and Duncan, 2013), and automated detection of these vocalizations still often requires either initial fine-tuning of the detector or post-processing of the detections to remove a high number of incorrectly identified calls.
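
As one hedged illustration of the post-processing step mentioned above, the sketch below chooses a detection-score threshold that reaches a target precision, trading some recall for fewer false detections; the data and the 0.90 target are placeholders, not values from any particular detector.

    # Illustrative post-processing: pick a score threshold meeting a target
    # precision, so fewer incorrectly identified calls pass downstream.
    # Labels, scores, and the 0.90 target are placeholders.
    import numpy as np
    from sklearn.metrics import precision_recall_curve

    labels = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
    scores = np.array([0.90, 0.20, 0.80, 0.70, 0.40, 0.10, 0.95, 0.30, 0.60, 0.85])

    precision, recall, thresholds = precision_recall_curve(labels, scores)
    target = 0.90
    # precision has one more entry than thresholds; align by dropping the last.
    idx = int(np.argmax(precision[:-1] >= target))
    print(f"threshold={thresholds[idx]:.2f}  "
          f"precision={precision[idx]:.2f}  recall={recall[idx]:.2f}")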
