Abstract

A system and method for compressing speech that uses an artificial neural network to calculate the recoded phase vector (Mozer code) produced by the spectral magnitude-to-phase transformation. Raw speech is equalized to remove spectral tilt and segmented into analysis frames. The spectral magnitudes of each frame are determined at a plurality of points by a Fourier transform, normalized, and applied to a neural-network magnitude-to-phase transform calculator, which provides a recoded phase vector. An inverse discrete Fourier transform then calculates the recoded speech waveform, and the two quarters of that waveform with minimum power are zeroed to produce the compressed speech output signal.
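The processing chain described above can be sketched in a few lines of NumPy. This is only an illustrative outline under stated assumptions, not the patented implementation: the first-order pre-emphasis filter and its coefficient `alpha`, the peak normalization, and all function names are assumptions, and `placeholder_phase_model` is a hypothetical stand-in that merely marks where the trained neural network would sit.

```python
import numpy as np

def pre_emphasize(x, alpha=0.95):
    # Assumed form of the equalization step: a first-order high-pass
    # filter y[n] = x[n] - alpha * x[n-1] that flattens spectral tilt.
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[1:] -= alpha * x[:-1]
    return y

def frame_magnitudes(frame):
    # Spectral magnitudes of one analysis frame, normalized to a peak of 1
    # (peak normalization is an assumption; the abstract only says "normalized").
    mags = np.abs(np.fft.rfft(frame))
    peak = mags.max()
    return mags / peak if peak > 0 else mags

def placeholder_phase_model(mags):
    # Hypothetical stand-in for the trained magnitude-to-phase network.
    # It returns an arbitrary alternating 0 / pi pattern, not a learned code.
    return np.where(np.arange(mags.size) % 2 == 0, 0.0, np.pi)

def zero_two_weakest_quarters(wave):
    # Split the frame into four quarters and zero the two with least power.
    # Assumes the frame length is divisible by 4.
    quarters = wave.reshape(4, wave.size // 4).copy()
    power = (quarters ** 2).sum(axis=1)
    weakest = np.argsort(power)[:2]
    quarters[weakest] = 0.0
    return quarters.reshape(wave.size)

def compress_frame(frame, phase_model=placeholder_phase_model):
    # Magnitudes -> recoded phases -> inverse DFT -> zero two quarters.
    frame = np.asarray(frame, dtype=float)
    mags = frame_magnitudes(frame)
    phases = phase_model(mags)
    recoded = np.fft.irfft(mags * np.exp(1j * phases), n=frame.size)
    return zero_two_weakest_quarters(recoded)
```

A frame would be passed through `pre_emphasize` and then `compress_frame`; swapping `phase_model` for a trained network is the only change needed to move from this placeholder toward the scheme the abstract describes.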
