Improving Speech Recognition Rate through Analysis Parameters

Deividas Eringis, Gintautas Tamulevičius

doi:10.2478/ecce-2014-0009

Open Access

https://doi.org/10.2478/ecce-2014-0009

Abstract

Speech signals are redundant and non-stationary by nature. Because of the inertia of the vocal tract, their properties do not change rapidly, so the signal can be considered stationary over short segments. The short-time magnitude spectrum is presumed to carry the most distinctive information about speech, which is the main reason speech signals are analyzed frame by frame. For this purpose, the signal is divided into overlapping segments (so-called frames), usually 15-25 ms long with an overlap of 10-15 ms. In this paper we present the results of our investigation into how analysis window length and frame shift influence the speech recognition rate. We analyzed three cepstral analysis approaches: mel-frequency cepstral analysis (MFCC), linear prediction cepstral analysis (LPCC), and perceptual linear prediction cepstral analysis (PLPC). The highest recognition rate was obtained with a 10 ms analysis window and a frame shift of 7.5 to 10 ms, regardless of analysis type. The largest increase in recognition rate was 2.5 %.
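The frame-by-frame segmentation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `frame_signal` and `frame_count` and the 16 kHz sample rate are assumptions made only for this example. The frame shift is the distance between the starts of consecutive frames, so the overlap equals the window length minus the shift.

```python
def frame_signal(samples, sample_rate, window_ms, shift_ms):
    """Split a sequence of samples into overlapping frames.

    Trailing samples that do not fill a complete frame are dropped.
    """
    # round() avoids truncation errors for fractional values such as 7.5 ms
    window_len = round(sample_rate * window_ms / 1000)
    shift_len = round(sample_rate * shift_ms / 1000)
    if window_len <= 0 or shift_len <= 0:
        raise ValueError("window and shift must be at least one sample long")

    frames = []
    start = 0
    while start + window_len <= len(samples):
        frames.append(samples[start:start + window_len])
        start += shift_len
    return frames


def frame_count(num_samples, window_len, shift_len):
    """Number of complete frames, with all lengths given in samples."""
    if num_samples < window_len:
        return 0
    return 1 + (num_samples - window_len) // shift_len


# Hypothetical input: one second of audio at an assumed 16 kHz sample rate.
SAMPLE_RATE = 16000
signal = list(range(SAMPLE_RATE))

# 10 ms window with a 7.5 ms shift (the best-performing range in the abstract).
short_frames = frame_signal(signal, SAMPLE_RATE, window_ms=10, shift_ms=7.5)

# 25 ms window with a 10 ms shift (within the commonly used range).
long_frames = frame_signal(signal, SAMPLE_RATE, window_ms=25, shift_ms=10)

print(len(short_frames), len(short_frames[0]))  # 133 frames of 160 samples
print(len(long_frames), len(long_frames[0]))    # 98 frames of 400 samples
```

Under these assumptions, a shorter window and shift produce more frames per second, each covering less signal. A real front end would normally also apply a window function (for example Hamming) to each frame before the cepstral analysis.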

Journal: Electrical, Control and Communication Engineering
Publication Date: May 1, 2014
Citations: 38
License type: CC BY-NC-ND 3.0
