Abstract

Deep-learning models have surpassed many computer-vision benchmarks, and groups such as Google have begun to investigate similar methods for understanding underwater data. Understanding underwater soundscapes is critical to many applications, such as assessing the impacts of anthropogenic noise on sea life and monitoring the health and biodiversity of the ocean. In this work, we present a computer-vision approach for classifying audio signals to distinguish whale sounds, in particular vocalizations from mysticetes (baleen whales), from other sources of sound in the underwater soundscape, such as ships. This is a challenging problem due to wide variation in ambient background noise, sensor configuration and properties, and whale vocalization patterns within and across species. Here, we adapt deep convolutional neural networks (CNNs) to analyze spectral patterns of common noise sources and demonstrate robust performance on a dataset of ambient noise derived from multiple open-source databases, including whale vocalizations from eight species and shipping noise from over ten platforms observed across multiple environments with a variety of sensors. Performance of the network is characterized in terms of classification accuracy and generalizability as a function of CNN hyperparameters and training architecture. Using a CNN trained from scratch, we analyze the features underlying the network's classification decisions by adapting several visualization techniques from the computer-vision domain.
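The sketch below illustrates the general pipeline the abstract describes: a hydrophone clip is converted to a log-magnitude spectrogram, classified by a small vision-style CNN, and inspected with a gradient saliency map (one common visualization technique from the computer-vision literature). This is a minimal sketch, not the authors' implementation; the sample rate, STFT parameters, layer sizes, class labels, and choice of saliency as the visualization method are all illustrative assumptions, since the abstract does not specify them.

```python
# Minimal sketch of a spectrogram-plus-CNN classifier for underwater audio.
# NOT the paper's architecture: all hyperparameters below are assumptions.
import torch
import torch.nn as nn
import torchaudio

SAMPLE_RATE = 4000   # assumed; mysticete calls are largely low-frequency
N_CLASSES = 2        # assumed labels: whale vocalization vs. other (e.g., shipping)

class SpectrogramCNN(nn.Module):
    """Small vision-style CNN operating on log-magnitude spectrograms."""

    def __init__(self, n_classes: int = N_CLASSES):
        super().__init__()
        # Waveform -> power spectrogram "image" (freq bins x time frames), in dB.
        self.to_spec = nn.Sequential(
            torchaudio.transforms.Spectrogram(n_fft=512, hop_length=128),
            torchaudio.transforms.AmplitudeToDB(),
        )
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # pool to one 64-d vector per clip
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, 1, freq_bins, time_frames) -> (batch, n_classes)
        return self.classifier(self.features(spec).flatten(1))

model = SpectrogramCNN()
clip = torch.randn(4, SAMPLE_RATE * 5)        # batch of 5-second hydrophone clips
spec = model.to_spec(clip).unsqueeze(1)       # (4, 1, 257, frames)
spec.requires_grad_(True)
logits = model(spec)                          # (4, N_CLASSES) class scores

# One visualization technique adapted from computer vision: a saliency map,
# i.e., the gradient of the "whale" logit with respect to the spectrogram,
# highlighting which time-frequency cells drive the classification decision.
logits[:, 0].sum().backward()
saliency = spec.grad.abs().squeeze(1)         # (4, 257, frames)
```

Treating the spectrogram as a single-channel image is what lets standard CNN building blocks, and the associated visualization tools, transfer directly from the vision domain to underwater acoustics.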
