Abstract

The principal-components statistical procedure for data reduction is used to efficiently encode speech power spectra by exploiting the correlations of power spectral amplitudes at various frequencies. Although this data-reduction procedure has been used in several previous studies, little attempt was made to optimize the methods for spectral selection and coding through the use of intelligibility testing. In the present study, principal-components basis vectors were computed from the continuous speech of several male and female speakers using various nonlinear spectral amplitude scales. Speech was synthesized using a combination linear predictive (LP) principal-components vocoder. Of the amplitude scales investigated for use with a principal-components analysis of speech spectra, logarithmic amplitude coding of non-normalized spectra emerged as a slight favorite. Speech synthesized from four principal components was found to be about 80% intelligible using a form of the Diagnostic Rhyme Test for rhyming word pairs and about 95% intelligible for words within a sentence context. Speech synthesized from spectral principal components compared favorably in intelligibility and quality with speech synthesized from a control LP vocoder with the same number of parameters.
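As a rough illustration of the data-reduction step described above, the following sketch computes principal-components basis vectors from log-amplitude, non-normalized power spectra and then encodes each frame with four coefficients. It is a minimal sketch under stated assumptions, not the study's procedure: the spectra are synthetic stand-ins for measured speech spectra, the function names (`fit_principal_components`, `encode`, `decode`) and the frame and bin counts are hypothetical, and the LP synthesis stage of the vocoder is not modeled.

```python
import numpy as np


def fit_principal_components(log_spectra, n_components=4):
    """Return (mean, basis); basis rows are the leading principal components."""
    mean = log_spectra.mean(axis=0)
    centered = log_spectra - mean
    # SVD of the centered data: the rows of vt are eigenvectors of the
    # covariance matrix, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def encode(log_spectra, mean, basis):
    """Project each frame onto the basis, giving one coefficient per component."""
    return (log_spectra - mean) @ basis.T


def decode(coeffs, mean, basis):
    """Rebuild approximate log spectra from the component coefficients."""
    return coeffs @ basis + mean


# Synthetic stand-in for speech power spectra (assumption: 200 frames, 32 bins).
# Amplitudes at different frequencies are correlated because every frame is a
# mixture of three smooth spectral shapes plus a little noise.
rng = np.random.default_rng(0)
n_frames, n_bins = 200, 32
freqs = np.linspace(0.0, 1.0, n_bins)
shapes = np.stack([np.cos(np.pi * k * freqs) for k in range(1, 4)])
weights = rng.normal(size=(n_frames, 3))
power = np.exp(weights @ shapes + 0.01 * rng.normal(size=(n_frames, n_bins)))

# Logarithmic amplitude coding of the non-normalized spectra.
log_spectra = np.log(power)

mean, basis = fit_principal_components(log_spectra, n_components=4)
coeffs = encode(log_spectra, mean, basis)   # four numbers per frame
recon = decode(coeffs, mean, basis)         # approximate log spectra
rms_error = float(np.sqrt(np.mean((recon - log_spectra) ** 2)))
print(coeffs.shape, rms_error)
```

Here each 32-bin frame is reduced to four coefficients. The reconstruction error is small only because the synthetic spectra were built from three underlying shapes; it says nothing about the intelligibility figures reported for real speech.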
