Abstract
A connected digit recognizer is proposed in which a set of isolated word templates is used as reference patterns and an unconstrained dynamic time warping (DTW) algorithm is used to spot the digits in the string. The segmentation boundary between digits is taken as the termination point of the dynamic path from the previous time warp, and a region around that boundary is searched for the optimum starting point of the succeeding digit. At each stage the recognizer keeps track of a set of candidate digit strings for each test string. The string with the smallest accumulated distance is used as the preliminary string estimate. To improve recognition accuracy, two post-correction techniques are applied to the entire set of hypothesized digit strings. The first creates a reference string by concatenating the reference contours of the digits in the hypothesized string and compares it to the test string using a constrained dynamic time warping algorithm. The second performs a similar comparison using voiced-unvoiced-silence contours instead of the measured features. Small but consistent improvements in recognition accuracy have been obtained using these techniques for both speaker-trained and speaker-independent systems, with digit strings recorded over dialed-up telephone lines. For variable-length digit strings of 2 to 5 digits (where the recognizer was not told the length of the string), word error rates of about 2-3 percent and string error rates on the order of 8 percent were obtained for both speaker-dependent and speaker-independent systems.
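The spotting and segmentation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: features are scalar values rather than measured feature vectors, the local path constraint is an Itakura-style rule (the test index may advance by 0, 1, or 2 frames per template frame, with no two consecutive 0-advances), and a single greedy hypothesis is kept instead of a set of candidate strings. The names `spot_digit`, `recognize`, and `slack` are hypothetical.

```python
import math


def spot_digit(template, test, start):
    """Warp all of `template` onto test[start:end], with `end` left free.

    Returns (distance normalized by template length, end), where `end` is
    the index of the first test frame after the matched segment.
    """
    n, m = len(template), len(test) - start
    if n == 0 or m <= 0:
        return math.inf, start
    INF = math.inf
    # D[i][j]: best cost of aligning template[:i+1] with test[start:start+j+1]
    D = [[INF] * m for _ in range(n)]
    # flat[i][j]: True if the best path reached (i, j) without advancing j
    flat = [[False] * m for _ in range(n)]
    D[0][0] = abs(template[0] - test[start])  # path is anchored at `start`
    for i in range(1, n):
        for j in range(m):
            cands = []  # (predecessor cost, step is flat)
            if not flat[i - 1][j]:  # assumed rule: no two flat steps in a row
                cands.append((D[i - 1][j], True))
            if j >= 1:
                cands.append((D[i - 1][j - 1], False))
            if j >= 2:
                cands.append((D[i - 1][j - 2], False))
            if not cands:
                continue
            prev, is_flat = min(cands, key=lambda c: c[0])
            if prev < INF:
                D[i][j] = prev + abs(template[i] - test[start + j])
                flat[i][j] = is_flat
    # Free end point: the path may terminate on any test frame.
    best_j = min(range(m), key=lambda j: D[n - 1][j])
    return D[n - 1][best_j] / n, start + best_j + 1


def recognize(templates, test, slack=1, max_digits=5):
    """Greedy digit-by-digit decoding (a simplification: one hypothesis only)."""
    boundary, digits, total = 0, [], 0.0
    while boundary < len(test) and len(digits) < max_digits:
        best = (math.inf, None, boundary)
        # Search a region around the previous path's termination point.
        lo = max(0, boundary - slack)
        hi = min(len(test), boundary + slack + 1)
        for start in range(lo, hi):
            for label, tpl in templates.items():
                dist, end = spot_digit(tpl, test, start)
                if end > boundary and dist < best[0]:
                    best = (dist, label, end)
        if best[1] is None:  # no template fits the remaining frames
            break
        total += best[0]
        digits.append(best[1])
        boundary = best[2]
    return digits, total


if __name__ == "__main__":
    # Toy scalar "feature contours"; real templates would be feature vectors.
    templates = {"1": [1, 2, 3], "2": [7, 5, 6]}
    print(recognize(templates, [1, 2, 3, 7, 5, 6]))
```

Because each path takes exactly one step per template frame under the assumed constraint, dividing by the template length gives distances that are comparable across start points and end points. A fuller version would carry several candidate strings forward at each stage and rescore them with the post-correction comparisons the abstract describes.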
More From: IEEE Transactions on Acoustics, Speech, and Signal Processing