Abstract

Differences in fundamental frequency (F0) are an important cue for segregating multiple speakers. However, the ability to exploit this cue for identification varies with sound level. For both different- and same-F0 conditions, the identification scores of both vowels in younger adults with normal hearing (YNH) increased from low to mid levels and then decreased at higher levels. These subjects benefited from the F0 difference, but the benefit varied across levels. The current study aims to develop a deep-neural-network (DNN) model that can predict these level-dependent changes in concurrent-vowel scores for YNH subjects. The DNN model comprises two two-dimensional convolutional neural networks (2D-CNNs) in parallel, with identical architectures, to predict the concurrent-vowel scores. The input layer was the neural response of an auditory-nerve model to the concurrent vowels. The 2D-CNN models were trained and validated on subsets of the concurrent vowels presented at 50, 65, and 75 dB SPL for both F0 conditions, using the batch gradient descent algorithm. The trained model was then fine-tuned with a single epoch. The 2D-CNN models were evaluated across vowel levels (25 to 85 dB SPL) and F0 conditions. Compared with previous models, the current model accurately predicts the level-dependent changes in concurrent-vowel scores for both different- and same-F0 conditions.
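To make the described architecture concrete, the following is a minimal NumPy sketch of the parallel-branch idea: two 2D-CNN branches with identical architectures but independent weights, each mapping a neurogram (auditory-nerve response laid out as characteristic-frequency channels by time bins) to a probability distribution over vowel identities. All layer sizes, filter counts, and the five-vowel output are illustrative assumptions, not the authors' actual configuration, and the paper's training procedure (batch gradient descent plus single-epoch fine-tuning) is omitted here.

```python
import numpy as np

def conv2d(x, k):
    """Valid 2D cross-correlation of a single-channel map x with kernel k."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    return np.maximum(x, 0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class VowelBranch:
    """One 2D-CNN branch: conv -> ReLU -> global average pool -> dense softmax.
    Layer sizes are placeholders, not the published architecture."""
    def __init__(self, n_filters=4, ksize=3, n_vowels=5, rng=None):
        rng = rng or np.random.default_rng(0)
        self.kernels = rng.standard_normal((n_filters, ksize, ksize)) * 0.1
        self.W = rng.standard_normal((n_vowels, n_filters)) * 0.1
        self.b = np.zeros(n_vowels)

    def forward(self, neurogram):
        # One scalar feature per filter via global average pooling.
        feats = np.array([relu(conv2d(neurogram, k)).mean()
                          for k in self.kernels])
        return softmax(self.W @ feats + self.b)

# Two branches with the same architecture but independent weights,
# one per vowel in the concurrent pair.
branch1 = VowelBranch(rng=np.random.default_rng(1))
branch2 = VowelBranch(rng=np.random.default_rng(2))

# Placeholder neurogram: 32 CF channels x 64 time bins of random activity.
neurogram = np.random.default_rng(3).random((32, 64))
p1 = branch1.forward(neurogram)
p2 = branch2.forward(neurogram)
```

Each branch returns a length-5 probability vector; a concurrent-vowel trial is scored correct for a vowel when the corresponding branch's argmax matches that vowel's identity.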

