Speech rate estimation using representations learned from speech with convolutional neural network

Renuka Mannem, Aravind Illa, Prasanta Kumar Ghosh, Hima Jyothi

doi:10.1109/spcom50965.2020.9179502

Abstract

With advances in machine learning techniques, several speech-related applications deploy end-to-end models to learn relevant features from the raw speech signal. In this work, we focus on the speech rate estimation task, using an end-to-end model to learn representations from raw speech in a data-driven manner. We propose an end-to-end model that comprises a 1-d convolutional layer to extract representations from raw speech and a convolutional dense neural network (CDNN) to predict speech rate from these representations. The primary aim of the work is to understand the nature of the representations learned by the end-to-end model for the speech rate estimation task. Experiments are performed using the TIMIT corpus, in seen and unseen subject conditions. Experimental results reveal that the frequency responses of the learned 1-d CNN filters are low-pass in nature, and the center frequencies of the majority of the filters lie below 1000 Hz. Comparing the proposed end-to-end system with the baseline MFCC-based approach, we find that the performance of the features learned with the CNN is on par with MFCC.
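The filter analysis mentioned in the abstract can be illustrated with a short sketch. It is not the authors' code. It assumes a 16 kHz sampling rate (the rate of TIMIT audio), uses a Hann-shaped kernel as a hypothetical stand-in for one learned 1-d convolutional filter, and takes the magnitude-weighted mean frequency as the "center frequency", since the abstract does not state which definition the paper uses. The helper names `frequency_response` and `center_frequency` are illustrative only.

```python
import numpy as np


def frequency_response(kernel, sample_rate=16000, n_fft=1024):
    """Return (frequencies in Hz, magnitude response) of a 1-D FIR kernel."""
    spectrum = np.fft.rfft(kernel, n=n_fft)           # zero-padded one-sided FFT
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return freqs, np.abs(spectrum)


def center_frequency(freqs, magnitude):
    """Magnitude-weighted mean frequency (an assumed definition)."""
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))


# Hypothetical stand-in for one learned filter: a normalized Hann kernel,
# which behaves as a low-pass filter. A real analysis would read the
# kernel weights from the trained first convolutional layer instead.
kernel = np.hanning(101)
kernel /= kernel.sum()                                # unit gain at 0 Hz

freqs, mag = frequency_response(kernel)
fc = center_frequency(freqs, mag)
is_low_pass = int(np.argmax(mag)) == 0                # peak response at DC

print(f"center frequency: {fc:.1f} Hz, low-pass: {is_low_pass}")
```

Repeating this over every kernel in the first layer and counting how many center frequencies fall below 1000 Hz would give a summary comparable in spirit to the one reported in the abstract.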

Journal: Control theory & applications
Publication Date: Jul 1, 2020
Citations: 22

Similar Papers

Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
Dimitri Palaz ... Mathew Magimai-Doss
25 Aug 2013

The stochastic aeroelastic response analysis of helicopter rotors using deep and shallow machine learning
Tanmoy Chatterjee ... Michael I Friswell
Neural Computing and Applications | VOL. 33
17 Jul 2021

Daily scale streamflow forecasting in multiple stream orders of Cauvery River, India: Application of advanced ensemble and deep learning models
Sujay Raghavendra Naganna ... Zaher Mundher Yaseen
Journal of Hydrology | VOL. 626
15 Oct 2023

End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition
Dimitri Palaz ... Ronan Collobert
Speech Communication | VOL. 108
30 Jan 2019
