Abstract
In the past few years, a great deal of research has been directed toward finding acoustic features that are effective for automatic speech recognition. Until recently, most speech recognizers used about 12 cepstral coefficients derived through linear prediction analysis as recognition features [1]. In [2,3], Furui investigated the use of temporal derivatives of cepstral coefficients and energy as recognition features in a dynamic time warping-based isolated word recognizer and showed that recognition performance improves when first derivatives are included in the feature set. These results were later confirmed in a number of studies for more general tasks (such as speaker-independent connected digit recognition and large-vocabulary continuous speech recognition) using hidden Markov model (HMM)-based speech recognizers [4-6]. More recently, some studies advocating the use of second (and higher)-order temporal derivatives of cepstral coefficients for speech recognition have been reported [7-9]. These temporal derivatives have also been found useful as features for speaker recognition [10-12]. As a result, most present-day speech recognizers use a larger feature set to enhance speech recognition performance [13-X]. This feature set usually consists of cepstral coefficients and energy, together with their derivatives. Though the addition of new features has improved speech recognition performance, it has created some problems, too. For example, the recognizer using a larger (or enhanced) feature set is computation-