Abstract
An efficient and compact representation of speech properties offers potential benefits in discriminating stuttering dysfluencies. The presence of dysfluencies in speech causes variation in vocal tract structure and articulatory movements. Mel Frequency Cepstral Coefficients (MFCC), acoustic features that represent vocal tract characteristics, have been widely used in speaker and speech recognition applications. MFCC emphasizes the short-term spectral properties of the signal and suppresses essential information about its temporal behavior. However, the combination of spectral and temporal information is essential for human perception of speech. In this paper, a new Linear Prediction-Hilbert transform based MFCC (LH-MFCC) feature extraction technique is proposed to capture the temporal, instantaneous amplitude and frequency characteristics of speech. The proposed LH-MFCC enhances the perceptual nature of conventional MFCC and thereby improves its discrimination ability. The applicability of these vocal tract features is studied in the context of stuttering dysfluency classification. Comparative analyses of MFCC and LH-MFCC for three types of dysfluencies are presented. LH-MFCC shows improved performance in all dysfluency classification experiments.
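As a point of orientation only, the instantaneous amplitude and frequency mentioned above are commonly obtained from the analytic signal produced by the Hilbert transform. The sketch below illustrates that single step on a synthetic tone. It is not the LH-MFCC pipeline itself: the linear prediction stage and the MFCC computation are omitted, and the function names, the sampling rate and the test tone are assumptions made for illustration.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive frequencies,
    zero negative frequencies. (Hypothetical helper, not from the paper.)"""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0          # Nyquist bin is kept once
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def instantaneous_amplitude_and_frequency(x, sample_rate):
    """Envelope and instantaneous frequency (Hz) of a real signal."""
    z = analytic_signal(x)
    amplitude = np.abs(z)                      # instantaneous amplitude
    phase = np.unwrap(np.angle(z))             # continuous phase in radians
    # The phase derivative gives frequency; np.diff yields one fewer sample.
    frequency = np.diff(phase) * sample_rate / (2.0 * np.pi)
    return amplitude, frequency


# Assumed test input: a 200 Hz cosine of amplitude 0.5 sampled at 8 kHz.
sample_rate = 8000
t = np.arange(800) / sample_rate
tone = 0.5 * np.cos(2 * np.pi * 200 * t)
amplitude, frequency = instantaneous_amplitude_and_frequency(tone, sample_rate)
```

For this stationary tone the envelope stays at 0.5 and the instantaneous frequency at 200 Hz; on real speech both quantities vary over time, which is the temporal behavior that short-term spectral features tend to suppress.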