Abstract

Code-excited linear prediction (CELP) coders deliver excellent communications-quality speech at rates below 8 kbps, which gives this architecture a predominant role in medium-rate and low-rate speech coding, as evidenced by its adoption in several recent fixed-rate and variable-rate standards. Unfortunately, some of these CELP-based schemes are not completely described in the literature, and consequently they are difficult to understand and to implement efficiently. This paper presents an original study of the G723.1 codec. The G723.1 encoder is designed to compress voice signals with a bandwidth of up to 4 kHz efficiently and to deliver an encoded data stream at a very low bit rate with good quality of the transmitted speech; typical applications are encoding of the voice signal for video conferencing over the GSTN and Voice over IP. We perform a detailed, gradual analysis, describing the MP-MLQ/ACELP speech coder from the point of view of a classical CELP structure. This approach allows us to identify, using theoretical considerations, the starting internal structure of each processing block in the encoder scheme. These results are then used to break down the main loop of the encoding algorithm. Finally, using the previously revealed starting internal structure, we derive the algorithm for the pitch predictor block, which is one of the most difficult parts of the ITU-T G723.1 encoder. The accompanying comments, explanations and diagrams allow efficient implementation and debugging of the corresponding software by regular DSP programmers.
