Improved Signal/Pause Segmentation Algorithm Based on the Probability Density Function of Background Noise and Empirical Mode Decomposition

Alan K Alimuradov, Pyotr P Churakov, Alexander Yu Tychkov

doi:10.1109/elconrus51938.2021.9396561

Publication Date: Jan 26, 2021

Citations: 1

Affiliation: Penza State University


Abstract

Segmentation into informative regions is an important stage in the pre-processing of speech. The quality of segmentation affects the performance of almost all known speech technology applications (speech recognition, speaker identification, speech-to-text conversion, etc.). This article presents an improved speech/pause segmentation algorithm. The original algorithm is based on the probability density function of the background noise and on an analysis of the one-dimensional Mahalanobis distance of the discrete time samples of the speech signal under investigation. The improvement consists of splitting the speech into fragments and decomposing each fragment into empirical modes, so that the one-dimensional Mahalanobis distance of the discrete time samples can then be analyzed for each mode separately. The improved algorithm was compared with the original algorithm and with well-known segmentation methods based on the zero-crossing rate and short-time energy. The results show that the improved segmentation algorithm provides the best detection of the start and end boundaries of informative speech sections, with errors of the first and second kind of 4.5767 % and 1.421 %, respectively.
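To make the core idea concrete, here is a minimal Python sketch of signal/pause labeling with a one-dimensional Mahalanobis distance. It is an illustration under stated assumptions, not the authors' implementation: the background noise is assumed to be roughly Gaussian and to occupy the first `noise_len` samples, and the function names, the threshold of 3, the 200-sample window, and the 5 % outlier fraction are all hypothetical choices. The empirical mode decomposition step of the improved algorithm is not shown; under the same assumptions, the same test would be applied to each mode of each fragment.

```python
import numpy as np

def mahalanobis_1d(x, noise_mean, noise_std):
    """One-dimensional Mahalanobis distance of each sample from the noise model."""
    return np.abs(x - noise_mean) / noise_std

def segment_signal_pause(x, noise_len, threshold=3.0, win=200, min_fraction=0.05):
    """Return a boolean mask: True where the signal is judged to be active.

    Assumption: the first `noise_len` samples contain background noise only.
    """
    noise = x[:noise_len]
    d = mahalanobis_1d(x, noise.mean(), noise.std())
    # Samples that are unlikely under the noise model.
    outliers = (d > threshold).astype(float)
    # Fraction of unlikely samples in a sliding window centered on each sample.
    frac = np.convolve(outliers, np.ones(win) / win, mode="same")
    return frac > min_fraction

# Synthetic demo: pause, 200 Hz tone in noise, pause (0.5 s each at 8 kHz).
rng = np.random.default_rng(0)
fs = 8000
n = fs // 2

def make_noise():
    return 0.02 * rng.standard_normal(n)

t = np.arange(n) / fs
tone = 0.5 * np.sin(2 * np.pi * 200 * t) + make_noise()
x = np.concatenate([make_noise(), tone, make_noise()])
mask = segment_signal_pause(x, noise_len=2000)
```

In this sketch, `mask` is True over the tone and False over the pauses, apart from about half a window of uncertainty at each boundary. Comparing such a mask with hand-labeled boundaries is one way the first- and second-kind error rates quoted above could be estimated.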

Full Text