Abstract
Dynamic time warping has been shown to be an effective method of handling variations in the time scale of polysyllabic words spoken in isolation. This class of techniques has recently been applied to connected word recognition with high degrees of success. In this paper a level building technique is proposed for optimally time aligning a sequence of connected words with a sequence of isolated word reference patterns. The resulting algorithm, which has been found to be a special case of an algorithm previously described by Bahl and Jelinek, is shown to be significantly more efficient than the one recently proposed by Sakoe for connected word recognition, while maintaining the same accuracy in estimating the best possible matching string. An analysis of the level building method shows that it can be obtained as a modification to the Sakoe method by reversing the order of minimizations in the two-pass technique with some subsequent processing. This level building algorithm has a number of implementation parameters that can be used to control the efficiency of the method, as well as its accuracy. The nature of these parameters is discussed in this paper. In a companion paper we discuss the application of this level building time warping method to a connected digit recognition problem.
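The level building idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes scalar feature frames, an absolute-difference local distance, and a simple local path constraint in which each test frame maps to exactly one reference frame and reference frames may repeat but not be skipped. The name `level_building` and its parameters are hypothetical, and the implementation parameters that the abstract says control efficiency and accuracy are omitted. At each level, every reference pattern is warped against the test sequence starting from the best ending points of the previous level, and the minimum over reference patterns is kept for each ending frame.

```python
import math


def level_building(test, refs, max_levels):
    """Illustrative level-building DTW (assumed simplifications, see lead-in).

    test: list of scalar frames; refs: dict mapping word -> list of scalar frames.
    Returns (total_distance, list_of_words) for the best string of <= max_levels words.
    """
    M = len(test)
    INF = math.inf
    # prev[e]: best cost of the previous level covering the first e test frames.
    prev = [0.0] + [INF] * M
    levels = []
    for _ in range(max_levels):
        cost = [INF] * (M + 1)   # best cost of this level ending at frame m
        word = [None] * (M + 1)  # reference word achieving that cost
        back = [None] * (M + 1)  # frame count at which that word started
        for name, ref in refs.items():
            N = len(ref)
            col = [INF] * N      # costs for the previous test frame
            start = [None] * N   # start point carried along each path
            for m in range(1, M + 1):
                new_col = [INF] * N
                new_start = [None] * N
                for n in range(N):
                    best, s = col[n], start[n]            # stay on ref frame n
                    if n > 0 and col[n - 1] < best:       # advance one ref frame
                        best, s = col[n - 1], start[n - 1]
                    if n == 0 and prev[m - 1] < best:     # begin the word here
                        best, s = prev[m - 1], m - 1
                    if best < INF:
                        new_col[n] = best + abs(test[m - 1] - ref[n])
                        new_start[n] = s
                col, start = new_col, new_start
                # Minimize over reference patterns at each ending frame.
                if col[N - 1] < cost[m]:
                    cost[m], word[m], back[m] = col[N - 1], name, start[N - 1]
        levels.append((cost, word, back))
        prev = cost
    # Pick the level whose path ends at the last test frame with least cost.
    best_level = min(range(max_levels), key=lambda l: levels[l][0][M])
    total = levels[best_level][0][M]
    if total == INF:
        return INF, []
    # Backtrack through the levels to recover the word string.
    words, m = [], M
    for l in range(best_level, -1, -1):
        _, word, back = levels[l]
        words.append(word[m])
        m = back[m]
    words.reverse()
    return total, words


refs = {"one": [1.0, 2.0], "two": [5.0, 6.0, 7.0]}
total, words = level_building([1.0, 1.0, 2.0, 5.0, 6.0, 6.0, 7.0], refs, max_levels=3)
```

In this toy usage the reference patterns and test sequence are made-up numbers chosen so that the test sequence is a time-stretched concatenation of the two references.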
More From: IEEE Transactions on Acoustics, Speech, and Signal Processing