Abstract

Speech plays a vital role in communication: from expressing oneself to using speech-based platforms, it is a necessity. Any disruption in speech is referred to as a disfluency and can impact a person's quality of life. This paper presents an experimental study of techniques for the detection and classification of speech disfluencies. Six types of disfluency are examined, namely Interjection, Sound Repetition, Word Repetition, Phrase Repetition, Revision, and Prolongation (6 classes). The paper goes a step further by including clean speech as an additional class alongside the six disfluencies, making the task a more robust 7-class problem. Several machine learning approaches are investigated on the University College London Archive of Stuttered Speech (UCLASS), a standard disfluency dataset created by University College London (UCL). Five feature extraction techniques are used: Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Cepstral Coefficients (LPCC), Gammatone Frequency Cepstral Coefficients (GFCC), Mel-filterbank energy features, and spectrograms. A comparative analysis of classifiers shows that MFCC, GFCC, and spectrogram features achieve greater than 90% accuracy on both the 6-class and 7-class tasks with the kNN classifier. As future work, the authors aim to tackle the challenge of detecting multiple disfluencies present simultaneously in a single speech sample.
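
The sketch below illustrates the kind of pipeline the abstract describes for one of its feature/classifier pairs (mean-pooled MFCC features fed to a kNN classifier). It is a minimal illustration, not the authors' implementation: the library choices (librosa, scikit-learn), the 13-coefficient setting, the mean-pooling step, the neighbour count, and the file/label layout are all assumptions made here for clarity.

```python
# Illustrative MFCC + kNN pipeline; hyperparameters and data layout are assumed.
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# 6 disfluency classes plus clean speech (7-class setting from the abstract).
LABELS = ["interjection", "sound_repetition", "word_repetition",
          "phrase_repetition", "revision", "prolongation", "clean"]

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Load a speech clip and summarise it as the mean MFCC vector."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return mfcc.mean(axis=1)                                # fixed-length vector

def evaluate(clips):
    """clips: list of (wav_path, label_index) pairs, e.g. segmented UCLASS audio."""
    X = np.stack([mfcc_features(p) for p, _ in clips])
    y = np.array([lbl for _, lbl in clips])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    return accuracy_score(y_te, knn.predict(X_te))
```

Swapping `librosa.feature.mfcc` for a GFCC or spectrogram front end, or replacing `KNeighborsClassifier` with another scikit-learn estimator, would reproduce the other comparisons mentioned in the abstract under the same assumptions.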
