Ocean noise negatively influences the recording of odontocete echolocation clicks. In this study, a hybrid model based on the convolutional neural network (CNN) and long short-term memory (LSTM) network-called a hybrid CNN-LSTM model-was proposed to denoise echolocation clicks. To learn the model parameters, the echolocation clicks were partially corrupted by adding ocean noise, and the model was trained to recover the original echolocation clicks. It can be difficult to collect large numbers of echolocation clicks free of ambient sea noise for training networks. Data augmentation and transfer learning were employed to address this problem. Based on Gabor functions, simulated echolocation clicks were generated to pre-train the network models, and the parameters of the networks were then fine-tuned using odontocete echolocation clicks. Finally, the performance of the proposed model was evaluated using synthetic data. The experimental results demonstrated the effectiveness of the proposed model for denoising two typical echolocation clicks-namely, narrowband high-frequency and broadband echolocation clicks. The denoising performance of hybrid models with the different number of convolution and LSTM layers was evaluated. Consequently, hybrid models with one convolutional layer and multiple LSTM layers are recommended, which can be adopted for denoising both types of echolocation clicks.
Read full abstract