Abstract

Speech processing technologies, including speech recognition, synthesis, and coding, are expected to play important roles in an advanced multimedia society with user-friendly human–machine interfaces. Speech recognition systems include not only those that recognize messages but also those that recognize the identity of the speaker. This paper predicts future directions in speech processing. It describes the most important research problems and tries to forecast where progress will be made in the near future and what applications will become commonplace as a result of the increased capabilities. The most promising application area is telecommunications. To solve various fundamental problems, a unified approach across speech recognition, synthesis, and coding is indispensable. Several research topics address the common phenomenon of voice individuality from different aspects: speaker adaptation in speech recognition, automatic speaker verification, voice conversion in speech synthesis, and the speaker-to-speaker variation in the quality of very-low-bit-rate speech coding. Methods that fundamentally solve such problems should be based on a common mathematical model. Research on the mechanism of speech information processing in the human brain, that is, how the meaning of speech is understood and how speech is produced, is also crucial for epoch-making technological development in conversational speech understanding and natural speech synthesis.

