Unvoiced Components Research Articles

Recent studies in text-to-speech synthesis have shown the benefit of using a continuous pitch estimate, one that interpolates the fundamental frequency (F0) even when voicing is not present. However, continuous F0 is still sensitive to additive noise in speech signals and suffers from short-term errors (when it changes rapidly over time). To alleviate these issues, this article develops three adaptive techniques for achieving a robust and accurate F0: (1) weighting the pitch estimates with the state noise covariance in an adaptive Kalman-filter framework, (2) iteratively applying time-axis warping to the input frame signal, and (3) optimizing all F0 candidates with an instantaneous-frequency-based approach. The second goal of this study is to introduce an extension of a novel continuous speech synthesis system (i.e., one in which all parameters are continuous). We propose adding a new excitation parameter, the Harmonic-to-Noise Ratio (HNR), to the voiced and unvoiced components to indicate the degree of voicing in the excitation and to reduce the buzziness caused by the vocoder. Objective and perceptual tests demonstrate that the voice built with the proposed framework gives state-of-the-art speech synthesis performance while outperforming the previous baseline.
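To make the first technique more concrete, the sketch below smooths a noisy F0 track with a scalar Kalman filter. It is a deliberate simplification and not the article's method: the state model is a plain random walk, and the process noise `q` and measurement noise `r` are fixed, hypothetical values, whereas the abstract describes an adaptive weighting of the state noise covariance. The function name and the sample F0 values are also assumptions made for illustration.

```python
def kalman_smooth_f0(f0_obs, q=4.0, r=100.0):
    """Smooth an F0 track (Hz) with a scalar random-walk Kalman filter.

    q: process noise variance (hypothetical fixed value, not adaptive)
    r: measurement noise variance (hypothetical fixed value)
    """
    x = f0_obs[0]   # state estimate, initialised from the first observation
    p = r           # estimate variance
    smoothed = [x]
    for z in f0_obs[1:]:
        p = p + q               # predict: random walk adds process noise
        k = p / (p + r)         # Kalman gain
        x = x + k * (z - x)     # update with the new observation
        p = (1.0 - k) * p       # posterior variance
        smoothed.append(x)
    return smoothed


# Made-up track with one octave-jump error at index 3.
noisy_f0 = [120.0, 121.0, 119.5, 240.0, 120.5, 121.5, 120.0]
smoothed_f0 = kalman_smooth_f0(noisy_f0)
print([round(v, 1) for v in smoothed_f0])
```

With these settings the isolated jump to 240 Hz is damped rather than followed. A larger `r` relative to `q` trusts the observations less and smooths more strongly, at the cost of lagging behind genuine pitch movement.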

Background/Objective: The objective of the present study is to classify a given speech signal into voiced and unvoiced components using energy as the differentiating parameter, since voiced components have higher energy than their unvoiced counterparts. Method/Statistical Analysis: The recorded speech signal is divided into frames, the short-time energy of each frame is computed, and the frames are then classified into voiced and unvoiced components. The protocol involves 44 subjects, both male and female, with no known vocal pathology. A predefined set of words in both Kannada and English was recorded in a noise-proof environment, and the recordings were then separated into voiced and unvoiced components using MATLAB. Findings: The results showed successful discrimination of the speech signal into voiced and unvoiced components based on statistical parameters calculated for each, providing a definite cue towards an automated approach to differentiating voiced from unvoiced speech. Application/Improvements: Such an approach can be useful in various speech processing and speech recognition applications.
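The framing and short-time energy steps described in this abstract can be sketched as follows. This is a minimal illustration in Python rather than the study's MATLAB code. The synthetic signal, the frame length, the non-overlapping hop, and the midpoint threshold rule are all assumptions chosen for the example, since the abstract does not specify them.

```python
import math


def short_time_energy(signal, frame_len, hop):
    """Return the sum of squared samples for each full frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def classify_frames(energies, threshold):
    """Label a frame 'voiced' when its energy exceeds the threshold."""
    return ["voiced" if e > threshold else "unvoiced" for e in energies]


# Synthetic stand-in for speech (an assumption, not the study's data):
# a strong low-frequency tone followed by a weak high-frequency tone.
sample_rate = 8000
loud = [0.8 * math.sin(2 * math.pi * 150 * n / sample_rate) for n in range(800)]
quiet = [0.05 * math.sin(2 * math.pi * 3000 * n / sample_rate) for n in range(800)]
signal = loud + quiet

energies = short_time_energy(signal, frame_len=200, hop=200)
# Hypothetical threshold: midpoint between the lowest and highest frame energy.
threshold = 0.5 * (max(energies) + min(energies))
labels = classify_frames(energies, threshold)
print(labels)
```

On real recordings a fixed midpoint threshold is fragile, so a practical system would likely derive the threshold from per-speaker statistics or combine energy with another cue such as zero-crossing rate.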

Articles published on Unvoiced Components

Adaptive Refinements of Pitch Tracking and HNR Estimation within a Vocoder for Statistical Parametric Speech Synthesis

Classification of Sex based Speech Differentiation in Healthy Human Beings based on Voiced and Unvoiced Components

Segregation of voiced and unvoiced components from residual of speech signal

Improving signal quality of a speech codec using hybrid perceptual-parametric algorithm

Electrolaryngeal speech enhancement for telephony

Multi-feature speech/music discrimination system

Wavelet-Based Speech Enhancement Using Time-Frequency Adaptation

Digital-formant synthesizer for speech-synthesis studies.

Digital Formant Synthesizer for Speech Synthesis Studies

Some New Methods for Digital Encoding of Voice Signals and for Voice Code Translation
