Detection of vowel segments in noise with ImageNet neural network architectures

René Fabricius, Ondrej Šuch

doi:10.1016/j.trpro.2021.07.112

Open Access

Journal: Transportation Research Procedia
Publication date: Jan 1, 2021
Citations: 1
License: CC BY

Affiliation: University of Žilina

Abstract

In this article we report on experiments in detecting vowel segments in speech with additive noise. Deep neural networks have become the key algorithm in the majority of modern machine learning solutions, and we investigate the performance of four ImageNet convolutional neural network (CNN) architectures on this task. Using image-processing CNNs is made possible by transforming the speech segments into spectrograms before classification. We perform the experiments on the TIMIT speech dataset, with noise drawn from the MAVD and ESC-50 datasets. On the dataset with added noise, accuracy did not vary significantly among the architectures. However, accuracy did differ significantly among the architectures when they were applied to noise containing no speech.
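The preprocessing described in the abstract, adding noise to a speech segment and turning the result into a spectrogram that an image CNN can consume, can be illustrated with a minimal sketch. The abstract does not state the mixing levels or the spectrogram parameters, so everything specific here is an assumption made for illustration: the function names `add_noise_at_snr` and `log_spectrogram`, the 10 dB signal-to-noise ratio, the 256-sample Hann window, the 128-sample hop, and the log-magnitude scaling. Synthetic signals stand in for TIMIT speech and MAVD/ESC-50 noise.

```python
import numpy as np


def add_noise_at_snr(speech, noise, snr_db):
    """Mix noise into speech at a requested signal-to-noise ratio in dB.

    Assumes the noise is not all zeros.
    """
    # Repeat or trim the noise so it covers the whole speech segment.
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the scale so that speech_power / scaled_noise_power
    # equals 10 ** (snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Log-magnitude STFT spectrogram of shape (frame_len // 2 + 1, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins and columns are time frames.
    return 20 * np.log10(magnitude + eps).T


# Synthetic stand-ins: a 200 Hz tone for a vowel segment, white noise for
# background noise, both at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
t = np.arange(1600) / 16000.0
speech = np.sin(2 * np.pi * 200.0 * t)
noise = rng.standard_normal(800)

noisy = add_noise_at_snr(speech, noise, snr_db=10.0)
spectrogram = log_spectrogram(noisy)
```

The resulting two-dimensional array can be treated as a single-channel image. How it would be resized or mapped onto the input format of a particular ImageNet architecture is not specified in the abstract and is left out of the sketch.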

Full Text