Abstract
Speech reconstruction is a key issue in speech coding. In this paper, we propose an extended least-squares estimate inverse short-time Fourier transform magnitude (LSE-ISTFTM) speech reconstruction algorithm for MFCC-based low bit-rate speech coding. The extended algorithm initializes the speech estimate with a specific signal rather than white noise and reconstructs voiced and unvoiced frames separately. Pitch frequency and voicing class are estimated with a Gaussian mixture model (GMM) from the magnitude spectrum obtained by inverting the MFCCs. In the voicing classification and pitch estimation experiments, the errors are lower than 1% and 5.62%, respectively. The speech reconstruction results show that the extended LSE-ISTFTM algorithm is more stable and converges faster than the original LSE-ISTFTM algorithm. The speech coding results also show that the proposed algorithm yields higher speech quality than the classic algorithm.
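For readers unfamiliar with the baseline, the following is a minimal sketch of the classic LSE-ISTFTM iteration (magnitude-only reconstruction by alternating between the target magnitude and the phase of the current estimate). It is not the extended algorithm proposed here. The function names, the Hann window, the hop size, and the `init` argument are illustrative assumptions; the abstract does not specify which initial signal the extended method uses, so `init` is left to the caller.

```python
import numpy as np

def stft(x, win, hop):
    """Windowed frames of x, transformed with a real FFT (one row per frame)."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    return np.array([np.fft.rfft(win * x[s:s + n]) for s in starts])

def lse_istft(spec, win, hop, length):
    """Least-squares inverse STFT: windowed overlap-add divided by summed squared windows."""
    n = len(win)
    num = np.zeros(length)
    den = np.zeros(length)
    for m, frame_spec in enumerate(spec):
        s = m * hop
        num[s:s + n] += win * np.fft.irfft(frame_spec, n)
        den[s:s + n] += win ** 2
    # Guard against division by zero where the window weight vanishes.
    return num / np.maximum(den, 1e-8)

def lse_istftm(target_mag, win, hop, length, init, n_iter=50):
    """Iterate: keep the phase of the current estimate, impose the target magnitude."""
    x = np.array(init, dtype=float)
    for _ in range(n_iter):
        phase = np.exp(1j * np.angle(stft(x, win, hop)))
        x = lse_istft(target_mag * phase, win, hop, length)
    return x

def spectral_error(x, target_mag, win, hop):
    """Squared distance between the STFT magnitude of x and the target magnitude."""
    return float(np.sum((np.abs(stft(x, win, hop)) - target_mag) ** 2))
```

In this sketch, passing white noise as `init` corresponds to the classic starting point mentioned above; a different starting signal can be supplied through the same argument. The signal length is assumed to satisfy `(length - len(win)) % hop == 0` so that the frame count stays fixed across iterations.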