Abstract
This paper proposes an efficient method for estimating the frame energy of speech directly from an enhanced variable rate coder (EVRC) bitstream, for network-based speech processing applications in transcoder free operation (TrFO) environments, where speech signals are represented as speech coding parameters. The frame energy of speech is decomposed into the energy of the excitation and the energy of the vocal tract filter, and an estimation method is derived for each component. Among the many parameters of the EVRC bitstream, the fixed codebook gain and adaptive codebook gain are used to estimate the excitation energy, and line spectrum pair (LSP) information is used to estimate the energy of the vocal tract filter. Experimental results demonstrate the effectiveness of the proposed method: the correlation coefficient between the actual and estimated frame energy is maintained at 0.994 while requiring only 5% of the multiplication operations of full decoding.
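The decomposition described above can be illustrated with a minimal Python sketch. It is not the paper's derivation: it assumes the LSPs have already been converted to LPC coefficients, that the excitation is roughly white, and that the adaptive and fixed codebook contributions are uncorrelated. All function and parameter names (`impulse_response_energy`, `excitation_energy`, `prev_exc_energy`, `codebook_energy`) are hypothetical.

```python
import math

def impulse_response_energy(a, n_taps=64):
    """Energy of the truncated impulse response of the all-pole filter 1/A(z),
    where A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p (assumed LPC convention)."""
    h = []
    for n in range(n_taps):
        acc = 1.0 if n == 0 else 0.0          # unit impulse input
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * h[n - k]          # recursive (all-pole) part
        h.append(acc)
    return sum(v * v for v in h)

def excitation_energy(g_p, g_c, prev_exc_energy, codebook_energy):
    """Assumed approximation: the adaptive contribution (gain g_p applied to
    past excitation) and the fixed contribution (gain g_c applied to the
    codevector) are treated as uncorrelated, so their energies add."""
    return g_p ** 2 * prev_exc_energy + g_c ** 2 * codebook_energy

def estimate_frame_energy(a, g_p, g_c, prev_exc_energy, codebook_energy):
    """Frame energy ~ excitation energy x filter impulse response energy
    (holds approximately for white excitation in steady state)."""
    e_exc = excitation_energy(g_p, g_c, prev_exc_energy, codebook_energy)
    return e_exc * impulse_response_energy(a)

def pearson(x, y):
    """Correlation coefficient between actual and estimated frame energies."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    var_x = sum((u - mx) ** 2 for u in x)
    var_y = sum((v - my) ** 2 for v in y)
    return cov / math.sqrt(var_x * var_y)
```

In this sketch only the gains and filter coefficients are used, so no speech samples are synthesized; that is the kind of saving over full decoding that the abstract refers to, although the actual operation count depends on the paper's own formulation.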