Abstract
In waveform interpolation (WI), a speech signal is reconstructed by the concatenation of infinitesimal segments of an evolving characteristic waveform, which is obtained by interpolation over time [W. B. Kleijn, IEEE Trans. Speech Audio Process. 1, 386–399 (1993)]. WI leads to efficient coding of voiced speech, but current implementations switch to CELP for nonperiodic signals. The WI paradigm is extended to provide an effective basis for the coding of voiced and unvoiced speech and background noise. Prototype waveforms are extracted every 2.5 ms. At this high rate the WI analysis–synthesis system results in transparent speech quality. Each prototype waveform is decomposed into a slowly evolving waveform (SEW), obtained by convolution in time with a 40-ms smoothing window, and a remainder, the rapidly evolving waveform (REW). Because of its low bandwidth, a low bit rate suffices for the SEW (additional processing lowers the bit rate further), while the REW requires only a rough statistical description (e.g., its phase spectrum can be randomized). The new paradigm facilitates efficient, robust speech coding in the range 2–8 kb/s.
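The SEW/REW split described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the prototype waveforms have already been extracted, aligned, and normalized to a common length, and it uses a raised-cosine smoothing window, whereas the abstract specifies only the window duration. The names `smoothing_window` and `decompose` are hypothetical.

```python
import math


def smoothing_window(length):
    """Raised-cosine window with strictly positive taps, normalized to unit sum.

    The window shape is an assumption; the abstract gives only its duration.
    """
    w = [0.5 - 0.5 * math.cos(2 * math.pi * (n + 1) / (length + 1))
         for n in range(length)]
    total = sum(w)
    return [x / total for x in w]


def decompose(prototypes, update_ms=2.5, window_ms=40.0):
    """Split a sequence of prototype waveforms into SEW and REW.

    prototypes[t][k] is sample k of the prototype extracted at time index t
    (one prototype every update_ms). Each sample position k is smoothed
    along the time axis t to give the SEW; the REW is the remainder.
    """
    taps = int(round(window_ms / update_ms))  # 40 ms / 2.5 ms = 16
    if taps % 2 == 0:
        taps += 1  # assumption: odd length so the window is centered
    w = smoothing_window(taps)
    half = taps // 2
    n_frames = len(prototypes)
    n_samples = len(prototypes[0])

    sew, rew = [], []
    for t in range(n_frames):
        acc = [0.0] * n_samples
        norm = 0.0
        for j, wj in enumerate(w):
            u = t + j - half
            if 0 <= u < n_frames:  # renormalize at the sequence edges
                norm += wj
                for k in range(n_samples):
                    acc[k] += wj * prototypes[u][k]
        sew_row = [a / norm for a in acc]
        sew.append(sew_row)
        rew.append([p - s for p, s in zip(prototypes[t], sew_row)])
    return sew, rew


if __name__ == "__main__":
    # A fixed pitch-cycle shape plus a component that flips sign every 2.5 ms.
    base = [math.sin(2 * math.pi * k / 8) for k in range(8)]
    frames = [[b + (0.1 if t % 2 else -0.1) for b in base] for t in range(40)]
    sew, rew = decompose(frames)
    print("SEW sample:", round(sew[20][2], 4), "REW sample:", round(rew[20][2], 4))
```

In this sketch a waveform shape that stays constant over time ends up entirely in the SEW, while a component that changes from one prototype to the next ends up mostly in the REW. By construction the two parts always sum back to the original prototype.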